Sir



We, the undersigned, Yehuda Afek, Anat Bremler-Barr and 
Dan Touitou, hereby declare as follows: 

1) We are the Applicants in the patent application 
identified above, and are the inventors of the subject matter 
described and claimed in claims 1-8, 10, 11, 13-16, 20, 33, 35 
and 46-69 therein. 

2) We conceived our invention prior to September 28, 2000, 
in Israel, a WTO country. We were then diligent in 
preparation of a provisional patent application covering the
invention during the period between September 28, 2000, and
October 17, 2000, when the provisional patent application (US
60/240,899) was filed. The present patent application (US
09/929,877) claims priority from this provisional patent
application.

3) As evidence of the conception of the present invention, 
we attach hereto, as Exhibits A and B, parts of a draft of the 
present patent application. These documents were prepared 
September 14, 2000, and September 18, 2000, respectively. 
(Proof of the dates of these documents, as well as other 
documents cited herein, is attached hereto as Exhibit G in the 
form of a directory listing of the archive in which the 
documents were stored. The relevant files and dates in the 
archive are noted below.) 

4) The following tables show the correspondence between 
the independent claims now pending in this application and 
Exhibits A and B. In view of this correspondence, it is clear 
that we conceived the claimed invention prior to September 28,
2000.



Claim 1 


Exhibits 


A method of responding to an 
overload condition at a 
network element ("victim") in 
a set of one or more potential 
victims on a network 


Exhibit A, page 1, paragraph 1:
"NetGuard system is activated
upon receiving alerts of an
attack. The system than focused
on defending only the victim(s)
of the attack."


A. responsively to an
indication of an anomalous 
traffic condition, initiating 
diversion of traffic destined 
for the victim by a first set 
of one or more network 
elements external to the set 
of one or more potential 
victims to a second set of one 
or more network elements 
external to the set of one or 
more potential victims 


Exhibit A, page 1, paragraph 4:
"At the time of the attack all
traffic to the server, which is
the victim of the attack, is
navigated to the NetGuard. This
is done by routing any traffic
using the victim public address
to NetGuards. Hence achieving
our first goal, that traffic
to the victim, from outside the
network, and inside the
network, is redirected to
NetGuards."


B. the element(s) of the
second set filtering traffic 
diverted in step A ("diverted 
traffic") and selectively 
passing a portion thereof to 
the victim. 


Exhibit A, page 1, last
paragraph: "The NetGuards
machine, discriminates between
traffic to the victim that is
part of the attack, and genuine
traffic. The traffic of the
attack would be blocked at
NetGuards. Genuine traffic
would be routed from the
NetGuards to the victim, using
the victim private address."



Claim 46


Exhibits 


A network element for use 
in protecting against an 
overload condition on a 
network 


Exhibit A, page 1, paragraph 1: 
"NetGuard system is activated 
upon receiving alerts of an 
attack. The system than focused on
defending only the victim(s) of
the attack."


an input for receiving
traffic diverted from the 
network, the traffic 
comprising flows of data 
packets having respective 
source addresses 


Exhibit A, page 1, paragraph 4:
"At the time of the attack all
traffic to the server, which is
the victim of the attack, is
navigated to the NetGuard. This is
done by routing any traffic using
the victim public address to
NetGuards."

Exhibit B, section 1.1: "It is
common (e.g., in the Cisco
convention) to define a network
flow by the following parameters:
i. Source IP address..."


a statistics module that 
is arranged to perform a 
statistical analysis of 
the diverted traffic so as 
to detect an anomalous 
pattern of a flow 
associated with at least 
one of the source 
addresses 


Exhibit B, section 1.3.2: "Attack
Analysis: Will be conducted during
attack time and will be
responsible to compare the
historically collected statistical
data with the current traffic
volume and generate rules for
traffic blockage. The output of
this unit, in general, will
consist of a list of items for
each of which three parameters
will be provided:

a. Network flow, identified by a
combination of source IP address
(can be prefixed), destination IP
address, destination port number,
protocol type..."


a filter, which is
operative, responsively to 
detection of the anomalous 
pattern, to block at least 
a portion of the data 
packets having the at 
least one of the source 
addresses 


Exhibit B, section 1.3, last
paragraph: "The analysis will be
based on the statistical
parameters of the data and will
aim at keeping the attacked
destination at normal loads by
blocking the most 'suspected'
traffic streams."


an output coupled to the 
input for selectively 
passing on to further 
elements in the network 
traffic not blocked by the 
filter 


Exhibit A, page 1, last paragraph:
"The NetGuards machine,
discriminates between traffic to
the victim that is part of the
attack, and genuine traffic. The
traffic of the attack would be
blocked at NetGuards. Genuine
traffic would be routed from the
NetGuards to the victim, using the
victim private address."



Claim 4 6 


Exhibits 


A system for use in 
protecting against an 
overload condition on a 
network 


Exhibit A, page 1, paragraph 1: 
"NetGuard system is activated 
upon receiving alerts of an 
attack. The system than focused
on defending only the victim(s)
of the attack."


one or more network 
elements ("guards") 
disposed on the network 


Exhibit A, page 1, paragraph 4:
"At the time of the attack all
traffic to the server, which is
the victim of the attack, is
navigated to the NetGuard."


an input for receiving
traffic from the network 


Exhibit A, page 1, paragraph 4:
"This is done by routing any
traffic using the victim public
address to NetGuards."


a filter coupled to the
input, the filter
selectively blocking 
traffic originating from a 
source suspected as 
potentially causing the 
overload condition 


Exhibit B, section 1.3, last
paragraph: "The analysis ... will
aim at keeping the attacked
destination at normal loads by
blocking the most 'suspected'
traffic streams."


a statistics module that is 
coupled to the filter and 
that identifies the traffic 
statistically indicative of 
having originated from the 
source suspected as 
potentially causing the 
overload condition 


Exhibit B, section 1.3.2:
"Attack Analysis: Will be
conducted during attack time and
will be responsible to compare
the historically collected
statistical data with the current
traffic volume and generate rules
for traffic blockage. The output
of this unit, in general, will
consist of a list of items for
each of which three parameters
will be provided:

a. Network flow, identified by a
combination of source IP address
(can be prefixed), destination IP
address, destination port number,
protocol type..."


an output coupled to the
input for selectively 
passing on to further 
elements in the network 
traffic not blocked by the 
filter 


Exhibit A, page 1, last
paragraph: "The NetGuards
machine, discriminates between
traffic to the victim that is
part of the attack, and genuine
traffic. The traffic of the
attack would be blocked at
NetGuards. Genuine traffic would
be routed from the NetGuards to
the victim, using the victim
private address."


one or more further network 
elements ("diverters")
disposed on the network and 
in communication with the 
guards, the further network 
elements selectively 
initiating, responsively to 
detection of an anomalous 
traffic condition, 
diversion to at least one 
of the guards traffic 
otherwise destined for a 
still further network 
element ("victim") in a set 
of one or more potential 
victims on the network 


Exhibit A, page 1, "routers"
shown in the figure diverting
traffic to "NetGuards," as stated
in paragraph 4 on page 1: "At
the time of the attack all
traffic to the server, which is
the victim of the attack, is
navigated to the NetGuard. This
is done by routing any traffic
using the victim public address
to NetGuards. Hence achieving
our first goal, that traffic to
the victim, from outside the
network, and inside the network,
is redirected to NetGuards."



Claim 56


Exhibits 


A method of responding to 
an overload condition at a 
network element ("victim") 
in a set of one or more 
potential victims on a 
network 


Exhibit A, page 1, paragraph 1:
"NetGuard system is activated
upon receiving alerts of an
attack. The system than focused
on defending only the victim(s)
of the attack."


diverting to a guard
machine traffic destined
for the victim, the traffic
comprising flows of data
packets having respective
source addresses


Exhibit A, page 1, paragraph 4:
"At the time of the attack all
traffic to the server, which is
the victim of the attack, is
navigated to the NetGuard. This
is done by routing any traffic
using the victim public address
to NetGuards."

Exhibit B, section 1.1: "It is
common (e.g., in the Cisco
convention) to define a network
flow by the following parameters:
i. Source IP address..."


performing a statistical
analysis of the diverted 
traffic at the guard 
machine so as to detect an 
anomalous pattern of a flow 
associated with at least 
one of the source addresses 


Exhibit B, section 1.3.2:
"Attack Analysis: Will be
conducted during attack time and
will be responsible to compare
the historically collected
statistical data with the current
traffic volume and generate rules
for traffic blockage. The output
of this unit, in general, will
consist of a list of items for
each of which three parameters
will be provided:

a. Network flow, identified by a
combination of source IP address
(can be prefixed), destination IP
address, destination port number,
protocol type..."


a filter, which is 
operative, responsively to 
detection of the anomalous 
pattern, to block at least 
a portion of the data 
packets having the at least 
one of the source addresses 


Exhibit B, section 1.3, last
paragraph: "The analysis will be
based on the statistical
parameters of the data and will
aim at keeping the attacked
destination at normal loads by
blocking the most 'suspected'
traffic streams."


responsively to detecting
the anomalous pattern,
preventing at least a
portion of the data packets
having the at least one of
the source addresses from
reaching the victim while
passing to the victim at
least some of the data
packets from other source
addresses


Exhibit A, page 1, last
paragraph: "The NetGuards
machine, discriminates between
traffic to the victim that is
part of the attack, and genuine
traffic. The traffic of the
attack would be blocked at
NetGuards. Genuine traffic would
be routed from the NetGuards to
the victim, using the victim
private address."




Claim 66 


Exhibits 


A method of responding 
to an overload condition 
at a network element 
("victim") in a set of 
one or more potential 
victims on a network 


Exhibit A, page 1, paragraph 1:
"NetGuard system is activated upon
receiving alerts of an attack. The
system than focused on defending
only the victim(s) of the attack."


coupling the victim to 
receive traffic from the 
network via a first port 
of a network switch 


Exhibit A, page 1: In the figure,
the victim is coupled to receive
traffic via one output of a router.


actuating the network
switch to divert the 
traffic destined for the 
victim to a second port 
to which a guard machine 
is coupled 


Exhibit A, page 1, paragraph 4: "At
the time of the attack all traffic
to the server, which is the victim
of the attack, is navigated to the
NetGuard. This is done by routing
any traffic using the victim public
address to NetGuards." The figure
shows that the NetGuard is coupled
to a different port of the router
from the victim.


filtering the diverted 
traffic using the guard 
machine 


Exhibit A, page 1, last paragraph:
"The NetGuards machine,
discriminates between traffic to the
victim that is part of the attack,
and genuine traffic. The traffic of
the attack would be blocked at
NetGuards."


selectively passing at 
least a portion of the 
filtered traffic from 
the guard machine to the 
victim 


Exhibit A, page 1, last paragraph:
"The traffic of the attack would be
blocked at NetGuards. Genuine
traffic would be routed from the
NetGuards to the victim, using the
victim private address."



5) During the period between September 28 and October 17, 
we worked continuously and diligently to revise and supplement 
the material in the original drafts in order to complete the 
provisional patent application that was subsequently filed. 
Some of the draft documents that we prepared during this 
period are attached hereto as Exhibits C, D, E and F. These 
documents were completed, respectively, on September 29, 
October 2, October 9, and October 13, 2000. We then
completed and filed our provisional patent application on
October 17, 2000.

6) Exhibit G is a directory listing of the archive from 
which Exhibits A-F were taken. The table below lists the file 
names and dates as they appear in Exhibit G: 



Exhibit   File Name                     Date

A         Netxxn.doc                    September 14, 2000
B         Statistical-patent4.doc       September 18, 2000
C         Copy of netxx.doc             September 29, 2000
D         Attack Identification.doc     October 2, 2000
E         Statistical-patent-hanochS    October 9, 2000
F         Mordi.ppt                     October 13, 2000



US 09/929,877

Declaration under 37 C.F.R. 1.131 by Afek et al.

We hereby declare that all statements made herein of our 
own knowledge are true and that all statements made on 
information and belief are believed to be true; and further 
that these statements were made with the knowledge that 
willful false statements and the like so made are punishable 
by fine or imprisonment, or both, under Section 1001 of Title
18 of the United States Code and that such willful false
statements may jeopardize the validity of the application or
any patent issued thereon.



Anat Bremler-Barr 
Citizen of Israel 
17 Hashomron Street 
Ramat Hasharon 
Israel 

Date : 



Dan Touitou 
Citizen of Israel 
21 Golani Street 
Ramat Gan 52224 

Israel 

Date:




Yehuda Afek 
Citizen of Israel 
26 Hacarmel Street 
Hod Hasharon 
Israel



EXHIBIT A



1. Activation netGuards system

As described above, NetGuard system is activated upon receiving alerts of an attack.
The system than focused on defending only the victim(s) of the attack. In the first
version of the NetGuard system we assume that the information about the existing of
the attack, and the information who is the victim is injected to the system from
outside.

Activation the NetGuard system enforces two important change in the flow of traffic
to the victim:

1. Traffic to the victim, from outside the network, and inside the network, is
redirected to NetGuards.

2. Only flow that passes through NetGuard can reach the actual victim.

In this section we describe in details one of the possible architecture and mechanism
that achieve the above goals.

We give to each server IP two IP addresses. One is the server public address and the
other is the server private address. The server public address is the address of the
server that is spread in the world, through the DNS mechanism. The server private
address is the address that known only to trustable parts of the networks, i.e., the
NetGuards and to the interfaces of routers that connected to routers or netGuards (See
figure 1). In other words, the server private address is not known to router
interfaces that are connected to hosts. This give us the ability, to discard packets
originated from hosts, that uses the server private address.

At the time of the attack all traffic to the server, which is the victim of the attack, is
navigated to the NetGuard. This is done by routing any traffic using the victim public
address to NetGuards. Hence achieving our first goal, that traffic to the victim, from
outside the network, and inside the network, is redirected to NetGuards. We give
below details of one of the possible ways to redirect the traffic (see subsection 1.1).




Figure 1: The victim private address is known only to trustable parts in the networks, i.e., the
interfaces of routers that connected to another router or to netGuards. The routes in the network where
the victim private address is known is marked by dashed red lines.

The NetGuards machine, discriminates between traffic to the victim that is part of the
attack, and genuine traffic. The traffic of the attack would be blocked at NetGuards.
Genuine traffic would be routed from the NetGuards to the victim, using the victim
private address. This would done by a simple manipulation on the destination field in
the packets of the genuine traffic. Hence we achieve our second goal, and only flow
that passes through NetGuard can reach the actual victim. This is due, to the fact that in
our architecture only traffic originated from NetGuard can use the victim private
address, and hence reach the victim.

When the attack is ended, we just cancel the redirection to NetGuards, of all the
traffic to the victim. Hence, again any traffic addressed victim public address can be
routed directly to the server, which was the victim of the attack. In the following
subsection we give the details of the redirection mechanism.
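
The public/private addressing scheme described above can be illustrated with a minimal Python sketch. It is not part of the exhibit: the `Packet` type, the function names and the `is_genuine` callback are hypothetical, and the two addresses are borrowed from the figures of this exhibit purely for illustration. The sketch shows only the two rules stated in the text: host-facing router interfaces discard packets that use the private address, and the NetGuard rewrites the destination field of genuine traffic from the public to the private address.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional

# Illustrative addresses (assumption): public address advertised through DNS,
# private address known only to NetGuards and router-to-router interfaces.
VICTIM_PUBLIC = "12.12.12.12"
VICTIM_PRIVATE = "12.12.12.1"


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


def host_facing_interface_accepts(pkt: Packet) -> bool:
    """Interfaces connected to hosts do not know the private address,
    so a packet from a host that uses it is discarded."""
    return pkt.dst != VICTIM_PRIVATE


def netguard_forward(pkt: Packet,
                     is_genuine: Callable[[Packet], bool]) -> Optional[Packet]:
    """Block attack traffic; forward genuine traffic to the private address."""
    if pkt.dst != VICTIM_PUBLIC:
        return pkt                      # not traffic for the victim
    if not is_genuine(pkt):
        return None                     # attack traffic is blocked at the NetGuard
    # Simple manipulation of the destination field, as described in the text.
    return replace(pkt, dst=VICTIM_PRIVATE)
```

In this sketch the decision of what counts as genuine is left to the caller, because the exhibit delegates it to the anti-spoofing and statistical mechanisms described separately.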

1.1 Redirecting traffic to NetGuards

Recall, that in case of attack we want to route any traffic addressed to the victim
public address to NetGuards. This can be done by two different methods, one more
suitable for the traffic that originated inside the network and the other more suitable to
the traffic that originated outside the network.

Traffic that originated outside the network, would surely need to pass through one of
the border routers, i.e., a router that surrounded the network. In our architecture a
NetGuard machine is attached to each such border router. In time of attack, the
control of the NetGuards system injected an update to the border router, by updating
the policy routing mechanism in the router. This update, would notified the border
router that any packets that is addressed to the victim public address would be forward
to the NetGuards. Updating the policy routing mechanism, give us the ability to
change the routing behavior without degrading the border routing performance [1].

In case the traffic, is originated from inside the network, one could use the same
mechanism, in order to redirect the traffic addressed to the victim to the NetGuard.
However, this required updating the policy routing of all the routers in the network.
Hence in many cases, it more beneficial to use a different method, that is based on a
simple routing manipulation. The designated NetGuard [2] that handles the inside traffic
to the victim would announced its IP address as the victim public address, while the
victim server, would withdraw from this address. This routing updating information,
would spread quickly in the networks, using the standard routing protocol, e.g.,
OSPF, EIGRP or RIP.

[1] Unlike access list, that required filtering every packet, and hence degrading the router
performance. Policy routing does not harm the routing performance.

To understand this, we briefly explain the look up process in today routers. Most of the routers use
Cisco Express Forwarding, or some equivalence mechanism. Using CEF, every interface has a cache
where it store the information about the next hop for the last packets that arrive through this interface.
When packets arrive to the interface card and the destination is not store in the cache, a new forwarding
process is done. This process is done in the central unit that does the lookup process for all the
interfaces. This process take into account the routing policy that can be defined per interface. This
operation is done rarely and in almost all of cases, the lookup operation use the cache information.
Hence the impact of the degrading in the forwarding time is minor.

[2] In some case in order to handle the volume of the intra network traffic, it may beneficial to use not
one NetGuards, but a farm of NetGuard. However, one should noticed that the problem of attacks, and
special spoofing attack, in most of the cases is harder when the attack is originated from outside the
network. When the attack is originated from inside network, there is full information and management
of the network. Hence ingress and egress filtering can be used, for dealing with spoofing attack. In
cases when the origins of the attack are known, one can more easily stop the attack, by disable the
origins of the attack.
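
The border-router redirection and the cached lookup process described in this subsection can be sketched as follows. This is a toy Python model under stated assumptions, not router configuration: the `BorderRouter` class, its method names and the next-hop labels are hypothetical, and dropping the cached entry when the policy changes is an assumption of the sketch that the exhibit does not spell out.

```python
class BorderRouter:
    """Toy border router: policy overrides plus a next-hop cache (fast path)."""

    def __init__(self, routes, netguard):
        self.routes = dict(routes)   # normal routing table: destination -> next hop
        self.netguard = netguard     # the NetGuard attached to this border router
        self.policy = {}             # policy-routing overrides installed during an attack
        self.cache = {}              # cached next hops for recently seen destinations
        self.slow_lookups = 0        # how often the central lookup was needed

    def _central_lookup(self, dst):
        # Slow path: the central unit takes the routing policy into account.
        self.slow_lookups += 1
        return self.policy.get(dst, self.routes.get(dst))

    def next_hop(self, dst):
        # Fast path: most packets are served from the cache.
        if dst not in self.cache:
            self.cache[dst] = self._central_lookup(dst)
        return self.cache[dst]

    def start_redirect(self, victim_public):
        # Attack time: forward traffic for the victim public address to the NetGuard.
        self.policy[victim_public] = self.netguard
        self.cache.pop(victim_public, None)   # assumption: stale entry is invalidated

    def stop_redirect(self, victim_public):
        # Attack over: cancel the redirection.
        self.policy.pop(victim_public, None)
        self.cache.pop(victim_public, None)
```

After `start_redirect` only the first packet to the victim takes the slow path; later packets hit the cache, which mirrors the argument that policy routing costs little forwarding time.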






TCP Anti-spoofing

We describe here a new anti-spoofing techniques which is TCP oriented. This
anti-spoofing mechanism authenticated the genuine of the source address of the flow,
based on the SYN mechanism of TCP. When a host wants to open a TCP connection
with the server, the hosts sends a SYN request, notified about its wish for a new
connection. The server authenticated its source address by sending him back a
random number. Than, the server wait to received this number back the source.
Naturally a spoofed source cannot repeat the number, and hence any connection
between the server and a spoof source, is dismissed. Hence the SYN phase, which is
the connection establishment phase in TCP (also called the three-way handshake), is
a naturally anti-spoofing method (see figure 2).

However, since this mechanism is done by the server, the SYN mechanism has
become one of the efficient way to do denial of service attack. SYN-attack, is based
on the fact that the server get high volume of SYN request. This lead to the fact that
the buffer, of SYN requests is filled, dismissing any new SYN requests, which can be
a SYN request of a genuine host. This kind of attack also make a huge burden on the
CPU server.

In our architecture we built a purposed computer, the NetGuard computers,
that take the role of the server and do the SYN process. The NetGuards, can deal
with a high volume of traffic, which in many cases equal to the bandwidth of the
links. The architecture of the NetGuards system also distributed the load of the attack
in the SYN-attack, on the number entrance points to the network.

Using a special computer for the SYN process, is very naturally solution. Also
in day to day life, the job of guarding and gatekeepering is separated from the actual
activity that is guarded due to performance issue.
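
The challenge-and-echo idea behind this anti-spoofing mechanism can be sketched in Python as below. The `SynAuthenticator` class and its methods are hypothetical names introduced for illustration; the sketch models only the random-number exchange, not real TCP packets, and it keeps one table entry per pending SYN, which is a simplification rather than a statement about how the NetGuard stores state.

```python
import secrets

SEQ_SPACE = 2 ** 32  # size of the TCP sequence-number space


class SynAuthenticator:
    """Authenticates a claimed source address by a random-number round trip."""

    def __init__(self):
        self._pending = {}          # (src_ip, src_port) -> acknowledgement we expect
        self.authenticated = set()  # sources that echoed the number correctly

    def on_syn(self, src_ip, src_port):
        """Handle a SYN: pick a random number and send it to the claimed source."""
        challenge = secrets.randbelow(SEQ_SPACE)
        self._pending[(src_ip, src_port)] = (challenge + 1) % SEQ_SPACE
        return challenge

    def on_ack(self, src_ip, src_port, ack):
        """Handle the final ACK: only a host that received the number can echo it."""
        expected = self._pending.pop((src_ip, src_port), None)
        if expected is not None and ack == expected:
            self.authenticated.add((src_ip, src_port))
            return True
        return False   # spoofed or unknown source: the connection is dismissed
```

A spoofed source never receives the reply carrying the random number, so it can only guess the acknowledgement value, which is the property the exhibit relies on.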



Dest=12.12.12.12, Port=80, Seq=12,345
Source=10.10.10.10, Port=1234

Dest=10.10.10.10, Port=1234, ack-Seq=12,346
Source=12.12.12.12, Port=1830, Seq=64,321

Dest=12.12.12.12, Port=1830, ack-Seq=64,322
Source=10.10.10.10, Port=1234

Client - 10.10.10.10          Server - 12.12.12.12



Fig 2: The SYN request.

Client SYN request:

Dest=12.12.12.12, D-Port=80
Source=10.10.10.10, S-Port=1234, S-Seq=12,345

Dest=10.10.10.10, D-Port=1234, D-Seq=12,346
Source=12.12.12.12, S-Port=1830, S-Seq=64,321

Dest=12.12.12.12, D-Port=1830, D-Seq=64,322
Source=10.10.10.10, S-Port=1234, S-Seq=12,347

Dest=12.12.12.12, D-Port=1830, D-Seq=64,323
Source=10.10.10.10, S-Port=1234, S-Seq=12,448
100 byte Data ...

Dest=10.10.10.10, D-Port=1234, D-Seq=12,448
Source=12.12.12.12, S-Port=1830, S-Seq=64,372
50 Data ...

NetGuard SYN request:

Dest=12.12.12.1, D-Port=80
Source=12.1.1.1, S-Port=3234, S-Seq=12,345

Dest=12.1.1.1, D-Port=3234, D-Seq=12,346
Source=12.12.12.1, S-Port=4830, S-Seq=34,231

Dest=12.12.12.1, D-Port=4830, D-Seq=34,232
Source=12.1.1.1, S-Port=3234, S-Seq=12,347

Dest=12.12.12.1, D-Port=4830, D-Seq=34,233
Source=12.1.1.1, S-Port=3234, S-Seq=12,448
100 byte Data ...

Dest=12.1.1.1, D-Port=3234, D-Seq=12,448
Source=12.12.12.1, S-Port=4830, S-Seq=34,281
50 Data ...

Client - 10.10.10.10
NetGuard - 12.1.1.1 (play the role of 12.12.12.12; 12.12.12.12 is the server public address)
Server - 12.12.12.1 (this address is the private address of the server)






, filling the buffer of the server, with spoofed



many spoofed source start the operation of the SYN-attack



If the source addressed is genaot spoof, than The basic idea is to send back to the 
source a random number. 



Authenticated the source of the flow. Thus distinguish between spoof source 
address to real source address. The Authentication mechanism is a new anti-spoofing 
techniques TCP oriented 



Ability to throw up - 

To divide the work of the routers.



EXHIBIT B



1. Attack Identification, Recognition and Isolation via
the Statistical Recognition Unit

The statistical recognition unit is responsible for analyzing the attack, identifying its
origin(s) and providing operational rules for blocking the attack without disturbing
innocent genuine traffic. The basic principle behind the unit's operation is that the
pattern of traffic originated at the attack sources during attack time drastically differs
from that pattern during normal operation. In contrast, traffic patterns of "innocent"
sources during an attack resemble those at normal times. This principle is used to
identify the attack sources and provide guidelines for their blockage.

The statistical unit has three major components: a) Classification of the victim traffic
into flows b) Learning the traffic patterns of the various victim flows under normal
operation conditions, and c) Monitoring the flows traffic pattern at attack time and
detecting the attack sources. Below we describe these in detail.

1.1 Network flows and traffic classification 

The statistical unit operation is based on classifying the traffic into network flows. A 
network flow can be viewed as a stream of packets that share the same properties. It 
is common (e.g., in the Cisco convention) to define a network flow by the following 
parameters: 

i. Source IP address. 

ii. Source port. 

iii. Destination IP address. 

iv. Destination port.

v. Traffic type (TCP/UDP/SYN).

ZZZ will use either this fine classification or a more coarse classification, guided by
the following considerations:

i. Disregarding source port: Will be done in the event that source
port does not serve as a good separator between malicious and
innocent traffic.

ii. Grouping (aggregating) a set of individual source addresses
into one set (e.g., by considering the IP address prefix):
Aggregation, if used, will serve to reduce the number of statistics
measured and computed at attack time, thus reducing the
processing complexity. Aggregation can be done in a hierarchical
manner.

iii. Disregarding destination address: At most cases the unit
operates to block attacks oriented on a single target (e.g.
www.xyz.com). In these cases, the statistical unit will receive as
input only flows destined to that destination (www.xyz.com). In
these cases classifying by destination address is irrelevant.
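
The fine and coarse classifications listed above can be illustrated with a short Python sketch. The `FlowKey` type, the `classify` function and its keyword parameters are hypothetical; the sketch only shows how dropping the source port, aggregating sources by address prefix and disregarding the destination address each collapse several fine-grained flows into one coarser key.

```python
import ipaddress
from typing import NamedTuple, Optional


class FlowKey(NamedTuple):
    src: str                  # source address or aggregated prefix, e.g. "10.1.2.0/24"
    src_port: Optional[int]   # None when the source port is disregarded
    dst: Optional[str]        # None when the destination address is disregarded
    dst_port: int
    traffic_type: str         # e.g. "TCP", "UDP", "SYN"


def classify(src, src_port, dst, dst_port, traffic_type, *,
             keep_src_port=True, src_prefix_len=32, keep_dst=True):
    """Map packet header fields to a flow key at the requested coarseness."""
    # strict=False lets a host address stand in for its enclosing network.
    network = ipaddress.ip_network(f"{src}/{src_prefix_len}", strict=False)
    return FlowKey(
        src=str(network),
        src_port=src_port if keep_src_port else None,
        dst=dst if keep_dst else None,
        dst_port=dst_port,
        traffic_type=traffic_type,
    )
```

With `src_prefix_len=24` and `keep_src_port=False`, packets from 10.1.2.3 and 10.1.2.77 fall into the same flow; choosing shorter prefixes gives the hierarchical aggregation mentioned in item ii.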
1.2 Traffic Studying During Peace Time



During "peace time" the unit will actively measure and study the traffic volumes of
the various flows. This is done in two major modules:

1. Traffic volume statistical data collection and classification: This
module operates at "peace" times and is destined to learn the normal
traffic volume patterns. This traffic learning is done in either (or both)
of the following approaches:

i. Sampling a fraction a of the packets (0<a<1)
traversing the lines on route to the target and then
classifying the packets based on the following
parameters:

i. Network flow: classification according
to the classification described in the
previous sub-section.
ii. Time of the day and day of the week.

Note that setting a=1 requires the unit to process every packet and
thus imposes high load on it while providing the best statistical
measure. Lower values of a reduce the load posed on the unit while
potentially somewhat degrading the statistical measure. The fraction
a, therefore, will be a parameter that will be set so that enough
statistical knowledge can be gained without over-loading the system.

This method can be used to classify all traffic type direct to/from the
defended targets, and requires sensing ("sniffing") the lines on route to
the destination. The sniffing devices must be placed as to measure all
traffic, that is, at the network boundary, or at the defendant target
proximity.

ii. Utilizing server logs collected by the defended target.
These typically contain information about the activity
being performed on the target. For example, WEB sites,
which are likely to form the main body of potential
targets, keep logs that record all the document requests
sent to the site (including their source address, time of
the day and other parameters). Processing of these logs
by ZZZ will yield a very accurate measure of the
statistics of network flow volumes (measured in packets
per second, as in a) above).



The traffic volume data collected will be summarized and will be stored in a
database that can be accessed via the various parameters of the flows.

2. Traffic analysis: Will be conducted at "peace time" and used to
generate statistical summaries of the data collected. In particular the
processing will be used to compute mean and variance of volumes of
each of the flows, or aggregates of flows. The analysis may also
dynamically change aggregates of flows in order to improve the
statistical identification of traffic. The results of this analysis will be
stored in a database to be used at attack time.
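
The peace-time collection and analysis steps above can be sketched in Python as follows. The function names are hypothetical, the `interval` value stands in for a time-of-day and day-of-week bucket, and scaling each sampled packet by 1/a to estimate the true volume is an assumption of this sketch, not something the exhibit specifies. Intervals in which no packet was sampled are simply absent from the result, which is a further simplification.

```python
import random
import statistics
from collections import defaultdict


def sample_counts(packets, a, rng):
    """Estimate per-(flow, interval) packet counts from a sampled fraction a.

    packets: iterable of (flow, interval) pairs, one per observed packet.
    a:       sampling fraction, 0 < a <= 1 (a = 1 processes every packet).
    rng:     a random.Random instance, passed in so runs are reproducible.
    """
    counts = defaultdict(float)
    for flow, interval in packets:
        if rng.random() < a:
            counts[(flow, interval)] += 1.0 / a   # assumed scale-up of the sample
    return counts


def summarize(counts):
    """Compute (mean, variance) of the per-interval volume for each flow."""
    per_flow = defaultdict(list)
    for (flow, _interval), volume in counts.items():
        per_flow[flow].append(volume)
    return {
        flow: (statistics.mean(volumes), statistics.pvariance(volumes))
        for flow, volumes in per_flow.items()
    }
```

The dictionary returned by `summarize` plays the role of the database of flow statistics that is consulted at attack time.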



1.3 Traffic Measurement and Analysis at Attack Time 

1. Online traffic volume collection at attack time: This module operates during 
attack times and is responsible for collecting the statistics of the traffic at that period. 
The module receives as input only traffic that is destined to the attacked target(s) 
and measures its packet rates. Note that, in this sense, its measures are similar to 
the "peace time" measures collected in approach 1a above. The classification of 
the traffic, in general, is similar to that conducted in "peace time" but may be 
controlled/guided by external intervention. Such intervention will be enacted if 
some additional knowledge on the attack type is gained from other sources (e.g., 
human-aided identification) and can be utilized by the unit. 

2. Attack Analysis: Will be conducted during attack time and will be 
responsible for comparing the historically collected statistical data with the 
current traffic volume and generating rules for traffic blockage. The output 
of this unit, in general, will consist of a list of items for each of which 
three parameters will be provided: 

a. Network flow, identified by a combination of source IP address (can be 
prefixed), destination IP address, destination port number, protocol type. 

b. Duration, identifying the duration for which that class will be blocked. 

The analysis will be based on the statistical parameters of the data and will 
aim at keeping the attacked destination at normal loads by blocking the 
most "suspected" traffic streams. Blocking rules will be based on 
maximizing the likelihood of blocking malicious data while minimizing 
the likelihood of blocking innocent data. 

1.4 Statistical Recognition of Data "Innocence" 

ZZZ will use two major properties of network flows to identify whether they are 
malicious or innocent. These are: a) Traffic pattern, and b) Traffic volume. Below 
we describe the recognition approaches based on these factors. 

1.4.1 Recognition of Traffic Pattern 

Several aspects of traffic pattern will be examined: 

1) Source "IP geography" proximity: Sources will be classified into 
classes that resemble the "IP geography", that is, IP addresses that 
reside on similar networks (similar IP address prefix) will be classified 
in the same class. A class that generates a relatively large volume 
of requests will be suspected as being malicious. Note that such 
"malicious classes" are likely to form if the attacker planted a 
collection of daemons in the same network. 

2) Periodicity: Sources will be examined for the periodicity of their 
requests. It is likely that malicious sources (unless very 
sophisticated) will act in a relatively periodic manner, while innocent 
sources act in a more random fashion. 

3) Packet Properties: Sources will be examined for repetitive properties 
of their packets, for example the distribution of packet size. It is likely 
that malicious sources (unless very sophisticated) will generate 
packets of identical properties (e.g., all packets of the same size) while 
innocent sources will generate packets of a more random nature. 
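
The three pattern aspects listed above can be illustrated with a small sketch. Everything named here is hypothetical: the /24 prefix length, the function names, and the particular measures (coefficient of variation of inter-arrival gaps for periodicity, fraction of distinct sizes for packet repetitiveness) are assumptions chosen for illustration, not measures taken from the draft.

```python
import ipaddress
import statistics
from collections import Counter


def prefix_class(source_ip, prefix_len=24):
    """Map an IPv4 source to its enclosing network (the /24 default is an assumption)."""
    return str(ipaddress.ip_network(f"{source_ip}/{prefix_len}", strict=False))


def requests_per_class(source_ips, prefix_len=24):
    """Count requests per "IP geography" class."""
    return Counter(prefix_class(ip, prefix_len) for ip in source_ips)


def interarrival_cv(timestamps):
    """Coefficient of variation of the gaps between requests; near 0 means periodic."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return None  # not enough requests to judge
    mean = statistics.mean(gaps)
    return statistics.pstdev(gaps) / mean if mean > 0 else None


def size_diversity(packet_sizes):
    """Fraction of distinct packet sizes; low values indicate repetitive packets."""
    if not packet_sizes:
        return None
    return len(set(packet_sizes)) / len(packet_sizes)


counts = requests_per_class(["198.51.100.4", "198.51.100.9", "203.0.113.1"])
print(counts.most_common(1))                   # busiest prefix class
print(interarrival_cv([0.0, 1.0, 2.0, 3.0]))   # a perfectly periodic source
print(size_diversity([64, 64, 64, 64]))        # identical packet sizes
```

A class with an unusually high request count, or a source with a very low inter-arrival variation or size diversity, would be a candidate for closer inspection; where to set such thresholds is left open here.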



1.4.2 Recognition of Traffic Volume 

Traffic volume recognition will be used to identify malicious sources that transmit 
large volumes of data which significantly differ from their normal volume. 
Specifically, we classify Internet data sources into small sources and large sources. 
The former relates to individual IP addresses whose traffic volume is normally 
tiny. The latter relates to Proxy traffic or Spider traffic^ whose volume is 
drastically higher. 

ZZZ will keep individual volume history for each of the large sources. Individual 
history will not be kept for the small sources; rather a single fixed small number 
(related to their mean volume averaged over all these sources) will be recorded. At 
attack time the traffic volumes of individual flows will be measured and compared 
to their recorded volume. Flows whose volume drastically differs (upwards) 
from their recorded measure will be marked as being malicious. 

The mathematical formulation of this procedure is as follows: Given are $K$ classes 
of flow of traffic, indexed $i = 1, 2, \ldots, K$, and characterized by the mean ($\mu_i$) and 
the variance ($\sigma_i$) of their historical volume, and by their current volume ($X_i$). 
We would like to identify the classes which deviate most from their expected 
volume. Let $Y_i = (X_i - \mu_i)/\sigma_i$. We will sort the classes by the value of $Y_i$ and 
will recommend to block (eliminate) the classes with the largest values of $Y_i$. 
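
The ranking just described can be sketched in a few lines. The function names, the class labels and the numbers below are made up for illustration; how many classes to block, and how to treat a class with no usable history, are assumptions the draft does not settle.

```python
def rank_by_deviation(classes):
    """classes: dict name -> (mu, sigma, x). Returns (name, Y) pairs, largest Y first."""
    scores = {}
    for name, (mu, sigma, x) in classes.items():
        if sigma <= 0:
            continue  # no usable history for this class; skipping it is an assumption
        scores[name] = (x - mu) / sigma
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def recommend_blocks(classes, max_blocked):
    """Names of the classes with the largest normalized deviation."""
    return [name for name, _ in rank_by_deviation(classes)[:max_blocked]]


# (historical mean, historical spread, current volume) per class, hypothetical values.
history = {
    "proxy-A": (1000.0, 100.0, 1100.0),
    "small-sources": (2.0, 1.0, 50.0),
    "spider-B": (500.0, 50.0, 450.0),
}
print(recommend_blocks(history, max_blocked=1))
```

Note that the class with the largest absolute volume (the proxy) is not the one recommended for blocking; normalizing by the historical spread is what singles out the small sources whose volume jumped far above their recorded level.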

1.4.2.1 Time accumulating traffic volume recognition and 
"controlled" denial of service 

It is important to recognize that the effectiveness of volume recognition increases 
with the time duration along which it is implemented. This is correct since the 
variance of total data volume generated by a source during a period of duration T 
decreases in T. For example, it is expected that the average amount of traffic 
generated by a small source during a period of 1 hour will be very small. 
However, at certain epochs, it is expected that the average amount of traffic 
generated by the same source during a period of 1 minute can be rather large, (up 
to 60 times larger than that of the 1 hour average). 



^ The traffic volume resulting from a Spider access is normally higher than that of a single human user, 
especially on large WEB sites. The reason is that a spider scans the whole site, leading to hundreds or 
thousands of requests while a human client requests tens of pages or less, on average. 

For this reason ZZZ will implement the following unique recognition and traffic 
screening mechanism. For source $i$, let $S_i(t)$ denote the amount of traffic generated 
by the source during the interval $(0, t)$ (where we assume that the attack starts at 
time 0). We then set at time $t$: $X_i(t) = S_i(t)/t$ and apply the above screening 
mechanism. 
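
A minimal sketch of the time-accumulated measure follows, assuming traffic is reported in whole-minute steps. The class AccumulatedVolume and the traffic numbers are hypothetical; the point is only that a short burst and a sustained load give the same rate early on and clearly different rates later.

```python
from collections import defaultdict


class AccumulatedVolume:
    """Tracks S_i(t), the traffic from each source since the attack started at time 0."""

    def __init__(self):
        self.total = defaultdict(float)

    def record(self, source, amount):
        self.total[source] += amount

    def rate(self, source, t):
        """X_i(t) = S_i(t) / t for elapsed time t > 0."""
        if t <= 0:
            raise ValueError("t must be positive")
        return self.total[source] / t


acc = AccumulatedVolume()
snapshots = {}
for minute in range(1, 25):
    acc.record("daemon", 40)        # sustained load every minute (made-up numbers)
    if minute == 3:
        acc.record("client", 120)   # one short burst, then silence
    snapshots[minute] = (acc.rate("daemon", minute), acc.rate("client", minute))

print(snapshots[3], snapshots[24])
```

At minute 3 the two sources are indistinguishable by rate, while at minute 24 the client's accumulated rate has fallen well below the daemon's, which is the behavior the two properties below rely on.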

This mechanism has the following properties: 

1. For a small value of t (that is, at the attack's 
beginning moments) a sophisticated attacker 
might cause a significant number of innocent 
users denial of service. This is due to the fact 
that the attacker may inflict a load that 
resembles that of an innocent client, and thus the 
attacker is not distinguishable from the innocent 
client. At this stage, ZZZ may block some 
innocent clients and some attackers. Using this 
action, for a short period some innocent clients 
may be denied service but ZZZ protects the 
site from going down! 

2. As t increases the denial of service conducted by 
ZZZ will act more and more on the 
malicious sources and less and less on the 
innocent sources. This is since the malicious 
sources have posed a large amount of accumulated 
load. Thus as time progresses fewer and fewer 
innocent clients are denied service. In fact, 
after a relatively short period all malicious sources 
will be denied service while the innocent 
sources will receive full regular service. 

Example: Consider the traffic volume generated on the web site of the Nagano 
server (Feb 98). It had 11,665,713 requests made over a period of 24 hours by 
59,582 clients. Assuming uniform distribution of clients over the day, this implies 
about 2500 clients per hour and 500 clients per 12-minute interval. An attacker 
who uses 500 sophisticated daemons (which imitate a normal client) will look 
innocent in the first 12-minute interval. During this period ZZZ will block 50% of the 
innocent clients and 50% of the daemons. However, after 24 minutes the daemons 
will have generated significantly more traffic than an innocent client and thus almost all 
of the traffic blocked will be that of daemons. 

Distributed Network Defense System 

Yehuda Afek, Hacamiel 26, Hod-Hasharon, Israel, and 
Anat Bremler-Barr, Hashomer 8, Holon, Israel 

1. Abstract 

A new concept and direction in the field of network and computer protection 
from cyber vandalism and Internet hackers is presented in this patent. In particular 
our mechanism is a flexible and robust method to protect servers and routers in the 
Internet from different distributed denial of service (DDoS) attacks, enabling 
continuous operation during an attack. Our method enables an ISP to provide 
protection against DDoS to all the servers residing in its autonomous system, thus 
relieving the individual servers from the need to provide protection from DDoS (over 
load and swamp) attacks. Our methods may protect any element in the network: 
computers, routers, servers and other elements. The invention has two major 
components, a guard machine and the concept of distributing the guard machines in 
key points in the Internet to achieve distributed defense against distributed denial of 
service attacks. In particular the idea is to place guarding machines all around a 
protected area of the network (e.g., an autonomous system), one guarding machine for 
each peering point of that area. The invention details the architecture, structure and 
operations of a guard machine, and the mechanisms and procedures that enable the 
distribution of the guards to protect particular victims in the network. 

2. Introduction 



Despite numerous distributed denial of service attacks that took place last year 
(with a surge of attacks on YAHOO, CNN, and many other major sites), there is still 
no known online solution that directly protects during an attack. Here we present a 
distributed defense system that does this, enabling the continuous operation of an 
attacked site while the attack is going on. 

Our method of protecting victim elements in the network during an attack and 
enabling their continuous operation despite the attack follows these major steps: 

a. Detection: We assume that an attacked victim has an automatic 
mechanism that detects and alerts when an attack begins. Such a mechanism is 
provided by different routers, firewall equipment and operating systems. 

b. Alert: Upon suspecting that an attack has begun an alert network 
activates a network of guard machines that are located at strategic points around 
an area of the network in which the victim resides. For example, the guards could 
be located around the autonomous system that hosts the victim. 

c. Divert: In addition the alert network invokes a rerouting mechanism 
that ensures first that all the traffic destined to the victim is diverted to the guards 
and second that no packet (message) reaches the victim unless it passed through a 
guard. 

d. Sieve: When an alert arrives at a guard and the victim traffic is being 
received, the guard sieves the traffic to sort out the "bad" packets and pass on to 
the victim only the "good" traffic. The architecture and operation of the special 
purpose guard machine is the second part of this patent and has the following four 
major components (see Figure 1): 

1. Anti-spoofing: An anti-spoofing module that authenticates 
and verifies for each flow (<source-address, source-port, destination- 
port> triplet) that a real process at the host with that source-address and 
behind that port-number has initiated this flow. 

2. Statistics: A module that detects and singles out flows 
(Source IP addresses or subnetworks) with outstanding behavior. The 
identity of these flows is passed to a filter. 

3. Filter: A module that blocks any packet originating from an 
IP address or subnetwork that was identified in the previous step as a 
source of malicious traffic. 

4. Ingress filtering: The guard machines interact with their 
neighboring routers to enable an effective usage of the ingress filtering 
feature. The routers do not always know which flows they may block 
because of route asymmetry. The guards' analysis of the traffic both at 
normal operation and during an attack would help to pinpoint which IP 
addresses and address blocks may be purged. 

5. Termination detection: All the guards participate in a 
fourth module which is a distributed algorithm they run to cooperatively 
decide when an attack has stopped and the victim may return to peaceful 
operation mode. This last transition has to hand over the good 
connections from the guards to the victim. 

Packets flow through a guard machine by first passing through the third 
component, then through the first and finally through the statistical module. 

Figure 1: High level block diagram of a guard machine. Thick lines represent the flows of victim 
traffic through the machine while the thinner lines represent control and other information paths. 
The customers database maintains statistical information about each potential victim's traffic patterns (each 
customer is a potential victim that the guard would protect). 
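
The order in which packets traverse the guard (filter first, then anti-spoofing, then the statistical module, which in turn feeds the filter) can be sketched as below. The classes Packet and Guard, the verified flag and the per-source packet threshold are all hypothetical stand-ins; the real modules are far richer than a counter and a set.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    src: str
    verified: bool = False  # set once the source has passed the anti-spoofing check


@dataclass
class Guard:
    blocked: set = field(default_factory=set)    # filter state (component 3)
    counts: dict = field(default_factory=dict)   # statistics state (component 2)
    threshold: int = 3                           # hypothetical per-source limit

    def process(self, pkt):
        # 1) Filter: drop sources already identified as malicious.
        if pkt.src in self.blocked:
            return "dropped-by-filter"
        # 2) Anti-spoofing: only authenticated flows continue.
        if not pkt.verified:
            return "dropped-unverified"
        # 3) Statistics: count the packet and report outstanding sources to the filter.
        self.counts[pkt.src] = self.counts.get(pkt.src, 0) + 1
        if self.counts[pkt.src] > self.threshold:
            self.blocked.add(pkt.src)
        return "forwarded"


guard = Guard()
results = [guard.process(Packet("198.51.100.7", verified=True)) for _ in range(5)]
print(results)
print(guard.process(Packet("203.0.113.5")))
```

Placing the filter first means that traffic from already-identified sources costs the guard almost nothing, which is presumably why the packet path does not follow the numbering of the components.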



In addition we describe how to deal with the open connections while transiting 
from normal operation to protect mode, and how to deal with the open connections 
while transiting from protect mode back to normal mode (after an attack has been 
stopped). 

The diversion method that passes all the victim traffic, during an attack, through 
the guard machines is described in the next section. In Section 4 the antispoofing 
module is described and the 

3 Activating the Guards system 

3 A Redirecting traffic to NetGuards 

Upon suspecting that an attack is being mounted on it, a victim alerts the guards 
through a communication channel supplied by the backbone provider, for example 
by sending authenticated messages to the NOC (Network Operations Center) from 
which the message is relayed to the guards (SNMP and other network management 
means may be used instead). The alert message contains the identity of the victim 
machines (which includes their IP addresses). At this point the victim enters the 
"protected" mode. 

In this "protected" mode all the traffic flows to the victim have to be diverted 
such that: 

1. All the traffic whose destination is the victim, from either 
outside the autonomous system (AS) that hosts the victim or from 
inside the AS, is redirected to the guards. 

2. All the traffic that reaches the victim must have passed 
through one of the guards, i.e., no traffic can reach the victim without 
going through the guards. 

In this section we describe one possible architecture and mechanism to achieve 
the above goals. 

Two IP addresses are associated with each victim destination machine (server): 
the server (victim) public address and the server (victim) private address. The server 
public address is the IP address of the server which is published all over the Internet 
through the DNS system. The server private address is an IP address used solely to 
transfer packets between the guards and the victim, while the guards protect the 
victim. Therefore, the server private address is recognized only by router 
interfaces that are connected to internal links of the hosting AS backbone, or to the 
guards, or to the victim interface. All other interfaces, such as connections to links 
that come from outside the AS, or links that are connected to other customers of the 
AS (stub networks), discard any packet whose destination is the server private IP 
address (easily and efficiently achieved by using the CEF mechanism of the routers). 
(See figure 1). This ensures that no hacked daemon can generate traffic to the victim 
private IP address. 

To achieve the above setting all the interfaces that are connected to external links, 
i.e., links that connect to either other networks (AS's) or to external hosts and 
customer networks, are permanently programmed to discard traffic destined to the 
server private IP address. In normal operation when no attack is being mounted, the 
victim declares itself to be at distance zero from both the server private IP address 
and the server public IP address. This causes the routing protocol to set entries in the 
forwarding tables in all the AS routers, to forward messages destined to either address 
to the victim (which is now not a victim) machines. 

To divert public IP address victim traffic that arrives from outside the hosting AS 
during an attack we notice that all such traffic must pass through one of the border 
routers, i.e., a peering or NAP, BGP routers. A guard machine is placed in each entry 
next to the border routers at this point. Upon receiving the alert of a possible attack 
on a victim all these border routers are set to forward all the traffic arriving from out 
of the network (AS) and whose destination address is the victim public IP address, to 
the guard machine which is placed next to them. This is easily achieved by injecting 
an update to the border routers, updating their policy routing mechanism. Updating 
the policy routing mechanism gives us the ability to change the routing behavior 
without degrading the border routing performance.^ In effect the guarding machines 
become a TCP proxy for the victim. All the traffic returning from the victim to 
trusted clients is passed through the corresponding guarding machine. 

^ Unlike access lists, which require filtering every packet and hence degrade the router performance, 
policy routing does not harm the routing performance. 

To understand this, we briefly explain the lookup process in today's routers. Most routers use 
Cisco Express Forwarding (CEF), or some equivalent mechanism. Using CEF, every interface has a cache 
where it stores the information about the next hop for the last packets that arrived through this interface. 
When packets arrive at the interface card and the destination is not stored in the cache, a new forwarding 
process is done. This process is done in the central unit that does the lookup process for all the 
interfaces. This process takes into account the routing policy that can be defined per interface. This 
operation is done rarely, and in almost all cases the lookup operation uses the cache information. 
Hence the impact on the forwarding time is minor. 

To divert victim public IP address traffic that originates inside the hosting AS to 
the internal or border guards, one could use a similar mechanism, that is, to inject, 
when the alert is received,^ the desired routing information into all the routers. 
However, this requires updating the policy routing of all the routers in the AS. In 
many cases (large networks), it is more beneficial to use a different method, based on 
a simple routing manipulation. In this method, when the victim suspects that an 
attack is being applied, it declares itself to be at a large distance from the server 
(victim) public IP address, while the guards start to declare that they are at 
distance zero (or close to zero) from the server (victim) public IP address.^ These 
routing updates quickly spread in the AS network, using the standard routing protocol 
(usually a link state type of protocol), e.g., OSPF, EIGRP or RIP.^ Thus within 
seconds from these declarations all the victim traffic is automatically diverted to the 
guards. 

When the guards decide (see how below) that the attack has terminated they send 
an appropriate message to the victim machine. At the same time they reverse the 
above settings, that is, they stop declaring that their distance from the server (victim) 
public IP address is zero, while the victim starts declaring again that it is at distance 
zero from its public IP address. 

Notice that in the "protect" mode several guards may all claim to be at distance 
zero from the victim public IP address. This divides the AS into clusters, such that 
packets with this destination address in each cluster are routed to the guard residing 
within that cluster. However, there might be routers on the border between two or 
more such clusters with equal distance to two or more guards. This may introduce 
routing instability, where some packets of a flow go to one guard and some packets go 
to the other guard. First notice that this affects only victim traffic that originates 
inside the hosting AS. The victim traffic that arrives from outside the hosting AS is 
treated by the guard at the entry point which acts as a proxy for that traffic. Thus 
outside traffic would suffer from route flapping only if such flapping is introduced 
by BGP, which is very rare. To avoid the flapping of victim traffic originating inside 
the AS, we set each guard to declare that it is at a very small but different distance 
from the victim public IP address. These small perturbations ensure that no router 
would be at equal distance from two guards. The exact value of this 
perturbation is automatically calculated given the ISP map of its backbone. 
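
The tie-breaking effect of the perturbation can be illustrated with a toy model. This is only a sketch under simplifying assumptions: path cost is taken to be a hop count plus the distance each guard advertises, the perturbation is a fixed multiple of a hypothetical step value rather than something derived from the backbone map, and the names advertised_distances and nearest_guards are invented for the example.

```python
def advertised_distances(guards, step=0.001):
    """Give every guard a distinct, tiny advertised distance (step is an assumption)."""
    return {g: (i + 1) * step for i, g in enumerate(sorted(guards))}


def nearest_guards(hops_to_guard, advertised):
    """Guards that minimize path cost plus advertised distance, as seen by one router."""
    cost = {g: hops + advertised[g] for g, hops in hops_to_guard.items()}
    best = min(cost.values())
    return sorted(g for g, c in cost.items() if c == best)


# A router sitting two hops from each of two guards (hypothetical topology).
router_view = {"guard-east": 2, "guard-west": 2}

tied = nearest_guards(router_view, {"guard-east": 0.0, "guard-west": 0.0})
unique = nearest_guards(router_view, advertised_distances(router_view))
print(tied)
print(unique)
```

With all guards at distance zero the router sees two equally good next hops; with distinct small offsets it always prefers one guard. For this to work the offsets must stay smaller than any real difference in path cost, which is why the draft derives them from the ISP's backbone map.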



^ In some cases, in order to handle the volume of the intra-network traffic, it may be beneficial to use not 
one NetGuard but a farm of NetGuards. However, one should notice that the problem of attacks, and 
especially spoofing attacks, is in most cases harder when the attack originates from outside the 
network. When the attack originates from inside the network, there is full information and management 
of the network. Hence ingress and egress filtering can be used for dealing with spoofing attacks. In 
cases when the origins of the attack are known, one can more easily stop the attack by disabling the 
origins of the attack. 

^ Unlike BGP, these routing protocols adapt very quickly to topological changes, thus correcting the 
forwarding tables in all the routers in hundreds of milliseconds. 

Figure 2: The victim private address is known only to trustable parts of the network, i.e., the 
interfaces of routers that are connected to another router or to NetGuards. The routes in the network where 
the victim private address is known are marked by dashed red lines. 

4 TCP Anti-spoofing 

We describe here an anti-spoofing technique which is TCP oriented. The basic 
principle of the idea was presented before by Checkpoint and Cisco (TCP Intercept). 
The innovation here is in the way it is combined with the other mechanisms and its 
employment on the borders of the hosting AS. This anti-spoofing mechanism 
authenticates the genuineness of the source address of the flow, based on the SYN 
mechanism of TCP. Doing so, each guard in effect becomes a very low level TCP 
proxy for the victim. When a client wishes to open a TCP connection with a server, it 
sends a SYN request, notifying the server about its attempt to open a new connection. 
The server authenticates the client source address by sending the client a random 
number. Then, the server waits to receive this random number back from the source. 
A spoofed source cannot repeat the number, and hence any connection between the 
server and a spoofed source is dismissed (see figure 2). 
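
The random-number check described above can be sketched as follows. This is a simplified model, not TCP itself: the class HandshakeVerifier and its methods are hypothetical, packets are reduced to function calls, and keeping one pending entry per SYN is a simplification that a real guard handling very large numbers of requests would need to bound.

```python
import secrets


class HandshakeVerifier:
    """Remembers the random number sent to each would-be client until it is echoed back."""

    def __init__(self):
        self.pending = {}  # (src_ip, src_port) -> challenge number

    def on_syn(self, src_ip, src_port):
        challenge = secrets.randbits(32)
        self.pending[(src_ip, src_port)] = challenge
        return challenge  # carried as the sequence number of the reply to the client

    def on_ack(self, src_ip, src_port, ack_number):
        challenge = self.pending.pop((src_ip, src_port), None)
        if challenge is None:
            return False  # no handshake was started from this address and port
        # TCP acknowledges the next expected sequence number, i.e. challenge + 1.
        return ack_number == (challenge + 1) % 2**32


verifier = HandshakeVerifier()

# A genuine client receives the reply and can acknowledge the number it carried.
c1 = verifier.on_syn("10.10.10.10", 1234)
genuine = verifier.on_ack("10.10.10.10", 1234, (c1 + 1) % 2**32)

# A spoofed source never sees the reply, so its acknowledgement is a wrong guess.
c2 = verifier.on_syn("192.0.2.99", 4321)
spoofed = verifier.on_ack("192.0.2.99", 4321, (c2 + 2) % 2**32)

print(genuine, spoofed)
```

Only after on_ack returns True would the guard open the corresponding connection towards the victim.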

The SYN mechanism or the connection establishment (three way handshake) is 
also one of the known denial of service attack methods. In this attack a large number 
of spoofed SYN requests are sent to the server. Each such request must be 
buffered and kept by the server for a period of time (30 seconds by the standard) until 
its corresponding SYN-ACK is received. The SYN-request buffer at the server 
overfills, which at worst brings the server down and at best causes the server to lose 
good genuine requests to open new connections. 

Each of our guard machines is a special purpose machine that among other things 
performs the connection establishment on behalf of the victim. Being special purpose 
it can handle a huge number of connections at very high speeds (supporting OC-192 
lines). Moreover, the distributed architecture of our system distributes the load 
among the different guard machines. 

In the next two figures we show the sequence of messages during the three way 
handshake. The first figure, Figure 3, shows the normal sequence of messages during 
the three way handshake between a client whose IP address is 12.12.12.12 and a 
server whose IP address is 10.10.10.10. In Figure 4 the same process is performed 
but now the SYN request message is intercepted by the guard machine which then 
performs the three way handshake on behalf of the server. Only after the guard 
machine receives the correct SYN-ACK message from the client does it open the 
corresponding connection with the server and start to function as a proxy between 
the client and the server. 



[Figure 3: message sequence between Client (10.10.10.10) and Server (12.12.12.12), as labeled in the figure] 

  Client to Server:  Source=10.10.10.10, Port=1234;  Dest=12.12.12.12, Port=80, Seq=12,345 
  Server to Client:  Source=12.12.12.12, Port=1830, Seq=64,321;  Dest=10.10.10.10, Port=1234, ack-Seq=12,346 
  Client to Server:  Source=10.10.10.10, Port=1234;  Dest=12.12.12.12, Port=1830, ack-Seq=64,322 

Fig 3: The SYN request. 

[Figure 4: message sequence with the NetGuard between client and server. Client: 10.10.10.10. 
NetGuard: 12.1.1.1, plays the role of 12.12.12.12 (12.12.12.12 is the server public address). 
Server: 12.12.12.1 (this address is the private address of the victim).] 

Client and NetGuard: 

  Client SYN request:      Source=10.10.10.10, S-Port=1234, S-Seq=12,345;  Dest=12.12.12.12, D-Port=80 
                           Source=12.12.12.12, S-Port=1830, S-Seq=64,321;  Dest=10.10.10.10, D-Port=1234, D-Seq=12,346 
                           Source=10.10.10.10, S-Port=1234, S-Seq=12,347;  Dest=12.12.12.12, D-Port=1830, D-Seq=64,322 
  Data, client to server:  Source=10.10.10.10, S-Port=1234, S-Seq=12,448;  Dest=12.12.12.12, D-Port=1830, D-Seq=64,323;  100 byte data 
  Data, server to client:  Source=12.12.12.12, S-Port=1830, S-Seq=64,372;  Dest=10.10.10.10, D-Port=1234, D-Seq=12,448;  50 data 

NetGuard and Server: 

  NetGuard SYN request:    Source=12.1.1.1, S-Port=3234, S-Seq=12,345;  Dest=12.12.12.1, D-Port=80 
                           Source=12.12.12.1, S-Port=4830, S-Seq=34,231;  Dest=12.1.1.1, D-Port=3234, D-Seq=12,346 
                           Source=12.1.1.1, S-Port=3234, S-Seq=12,347;  Dest=12.12.12.1, D-Port=4830, D-Seq=34,232 
  Data, client to server:  Source=12.1.1.1, S-Port=3234, S-Seq=12,448;  Dest=12.12.12.1, D-Port=4830, D-Seq=34,233;  100 byte data 
  Data, server to client:  Source=12.12.12.1, S-Port=4830, S-Seq=34,281;  Dest=12.1.1.1, D-Port=3234, D-Seq=12,448;  50 data 

5. Ingress filtering 

All bla bal bla bla bla authentication and was not stopped by the 

6. Attack Identification, Recognition and Isolation via the Statistical 
Recognition Unit. 

All the victim traffic that has passed the anti-spoofing authentication and was not 
stopped by the filter flows through the statistical unit. The statistical unit analyzes the 
traffic and identifies malicious sources (i.e., compromised sources), and provides 
operational rules for blocking the attack without disturbing innocent genuine traffic. 
The basic principle behind the unit's operation is that the pattern of traffic originating 
from a black-hat daemon drastically differs from the pattern generated by such a 
source during normal operation. In contrast, traffic patterns of "innocent" sources 
during an attack resemble their traffic at normal times. This principle is used to 
identify the attack sources and provide guidelines for their blockage by either the 
filter or the access lists of the routers. For example, the volume of traffic from an 
attacking daemon, the distribution of packet sizes, port numbers, the distribution of 
the packets' inter-arrival times, and the ratio of inbound and outbound traffic are all 
parameters that may indicate that a source (client) is an attacking daemon. 

The statistical unit has two major tasks: 

1. Learning the traffic patterns during normal operation, i.e., when no attack is 
being mounted. These patterns are used while defending a victim during an attack 
to compare with the actual traffic in order to distinguish the malicious traffic from 
genuine traffic. We consider three possible ways in which this learning can be 
done: (1) Using the routers' NetFlow data, (2) Analyzing the server logs at the 
victim server, and (3) Analyzing the potential victim traffic at the guard by 
having the traffic diverted to the guard from time to time for random sampling. 

2. Monitoring the victim traffic during an attack to identify and isolate the 
malicious traffic from the good genuine traffic. The identity of the attacking host 
is then given to the filter or the neighboring routers that would then drop any 
packet arriving from that host. 



5.1 Network flows and traffic classification 

The basic element studied by the statistical unit is a flow. Each flow is a 
sequence of packets belonging to the same connection. In the most general way a 
flow is identified by the following parameters: Source IP address, Source port, 
Destination port, Protocol type, time of day and day of week of connection creation. 
The destination IP address is implied since we collect all the information per 
destination address. For each such flow the traffic volume is registered. 

Keeping all of the above information is infeasible since it requires an 
unacceptable amount of memory. However we employ learning methods to study the 
basic characteristics of the traffic destined to each destination and keep these key 
parameters succinctly, in an efficient way. Essentially the learning method studies the 
typical behavior of groups of users that interact with the destination. For example, a 
typical web site is accessed either by individual users sitting behind a host (pc), by a 
group of users sharing one multi user time sharing host, or by a group of users sitting 
behind a proxy. For each such group its typical behavior is studied. Other types of 
users are possible such as web crawlers (for search engines) and monitoring servers 
such as Keynote (www.keynote.com). Furthermore, for the largest group, i.e. the group 
of individual users, their identities (IP address) are not kept. Each source which is not 
included in the other groups is assumed to be an individual source. On the other hand, 
for the group of proxy machines that access the destination, the individual IP address 
of each is kept in a trie like data-structure. For other groups of users only their IP 
address may be needed since their traffic would be blocked from the beginning during 
an attack. Henceforth, the rest of this section considers types of users and the 
characteristic of flows originating from such users. 
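
As a rough illustration of the flow identifier described in this section, the sketch below builds a key from the listed parameters and registers volume against it. The names FlowKey, flow_key and register are hypothetical, the hour-level granularity for time of day is an assumption, and a plain in-memory counter stands in for whatever succinct structure the learning method would actually keep.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FlowKey:
    src_ip: str
    src_port: int
    dst_port: int
    protocol: str
    hour_of_day: int
    day_of_week: int   # Monday == 0, as returned by datetime.weekday()


def flow_key(src_ip, src_port, dst_port, protocol, created):
    """Build the key from the connection parameters and its creation time."""
    return FlowKey(src_ip, src_port, dst_port, protocol, created.hour, created.weekday())


# One table per destination, so the destination address itself is implied.
volumes = Counter()


def register(key, packets):
    volumes[key] += packets


start = datetime(2000, 9, 18, 14, 30)
key = flow_key("198.51.100.7", 40000, 80, "tcp", start)
register(key, 12)
register(key, 3)
print(key.hour_of_day, key.day_of_week, volumes[key])
```

Because the key is a frozen dataclass it is hashable and can index the volume table directly; grouping sources into user types, as the text goes on to describe, would replace the per-address key with a per-group one.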

The basic parameters characterizing each user group are: 

1. Traffic volume distribution: These include the mean and variance of the 
traffic such a user generates. 

2. Port numbers distribution: Source port number distribution, and 
destination port number distribution. 

3. Periodicity: Sources will be examined for the periodicity of their requests. 
It is likely that malicious sources act in a relatively periodic manner, while 
innocent sources do not act in a regular manner. 

4. Packet Properties: The distribution of packet sizes. 

5.2 Learning Traffic Characteristics 

There are three possible ways, that we consider, to learn and analyze the traffic 
characteristics of a particular target: 

1. Sampling a fraction $\alpha$ of the packets ($0 < \alpha \le 1$) traversing the lines on 
route to the target and then classifying the packets according to the flow id and 
time of day and day of week. 

Notice that setting $\alpha = 1$ requires the unit to process every packet and thus imposes 
high load on it while providing the best statistical measure. Lower values of $\alpha$ 
reduce the load posed on the unit while potentially somewhat degrading the 
statistical measure. The fraction $\alpha$, therefore, will be a parameter that will be set 
so that enough statistical knowledge can be gained without overloading the 
system. 

2. Utilizing server logs collected by the defended target. These typically 
contain information about the activity being applied on the target. For example, 
WEB sites, which are likely to form the main body of potential targets, keep logs 
that record all the document requests sent to the site (including their source 
address, time of the day and other parameters). Processing of these logs by ZZZ 
yields a very accurate measure of the statistics of network flow volumes 
(measured in packets per second, as in a) above). The potential drawbacks of this 
method are first that, being collected at the target, it is not immediately clear which 
information is relevant to which guarding point, and second, the pattern seen by 
the target may be slightly different from the pattern seen at the network border. 
However neither is a real problem and the first one may be a feature, since 
network routes change and the traffic may enter the network from a different point 
in any event. 

3. Analyzing netflow data collected from the appropriate routers. This 
option requires the backbone provider to enable netflow and process it with our 
learning applications. This method has some limitations but none seems 
prohibitive. The limitations are that netflow aggregates information for each flow 
in intervals of a few minutes (typically 5-minute intervals), and in these intervals it 
does not maintain the sizes of individual packets. Rather, it counts the total 
number of packets and bytes passed in the interval for each flow. 



5.3 Traffic Monitoring and Analysis at Attack Time 

At attack time, while the guard machine defends a victim, it monitors the victim traffic, 
classifies its traffic (incoming and outgoing) and compares the traffic to the normal 
traffic in order to detect the malicious traffic. Notice that during an attack 
information is collected only on the current flows. The information about well 
behaving flows is not kept for more than a small number of minutes. 

1. Online traffic volume collection at attack time: This module collects 
the statistics of the traffic destined to the target(s) at attack time. 
Notice that in this sense, its measures are similar to the measures 
collected in approach 1 above. The classification of the traffic, in 
general, is similar to that conducted in the learning phase but may be 
controlled/guided by external intervention. Such intervention is 
enacted if some additional knowledge on the attack type is gained from 
other sources (e.g., human-aided identification) and can be utilized by 
the unit. 

2. Attack Analysis: This is conducted at attack time and is responsible for 
comparing the statistical data learned with the current traffic volume and 
generating rules for traffic blockage. The output of this unit, in general, 
will consist of a list of items for each of which three parameters will be 
provided: 

a. Network flow, identified by a combination of source IP address (can 
be prefixed), destination IP address, destination port number, protocol 
type (one may consider blockage that disregards port numbers, i.e., all 
the traffic originating from a compromised IP address, be it a proxy or 
a host). 

b. Duration, identifying the duration for which that class will be 
blocked. 

The analysis is based on the statistical parameters of the data and aims 
at keeping the target traffic at normal loads by blocking the most 
"suspicious" traffic streams. Blocking rules are based on maximizing 
the likelihood of blocking malicious traffic while minimizing the 
likelihood of blocking innocent traffic. 

5.3.1 Statistical Recognition of Data "Innocence" 

NETGUARDS uses two major properties of network flows to identify 
malicious traffic: a) Traffic pattern, and b) Traffic volume. Below we describe the 
recognition approaches based on these factors. 

5.3.1.1 Recognition of Traffic Pattern 

Several aspects of traffic pattern are examined: 

1) Source "IP geography" proximity: Sources will be classified 
into classes that resemble the "IP geography", that is, IP addresses that 
reside on neighboring networks (using IP address prefix) are classified 
in the same class. A class that generates a relatively large volume of 
requests is suspected as being malicious. Notice that such "malicious 
classes" are likely to form if the attacker planted a collection of 
daemons in the same network, and this network does not use a proxy. 

2) Periodicity: Sources will be examined for the periodicity of 
their requests. It is likely that malicious sources act in a relatively 
periodic manner, while innocent sources act in a more irregular pattern. 

3) Packet Properties: Sources will be examined for repetitive 
properties of their packets, for example the distribution of packet size. 
It is likely that malicious sources generate packets of identical 
properties (e.g., all packets of the same size) while innocent sources will 
generate packets of a more random nature. Other properties include port 
number distributions. 
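
As an illustration of the "IP geography" classification in item 1, the following Python sketch groups source addresses by a common address prefix and counts requests per class. The /24 prefix length is an assumption chosen for the example; the text does not fix a prefix length.

```python
import ipaddress
from collections import Counter


def prefix_class(ip, prefix_len=24):
    """Return the network (as a string) that contains ip at the given prefix length."""
    # strict=False lets us pass a host address and get its enclosing network.
    net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(net)


def requests_per_class(source_ips, prefix_len=24):
    """Count requests per prefix class; one entry in source_ips per request."""
    return Counter(prefix_class(ip, prefix_len) for ip in source_ips)
```

A class whose count is large relative to the others would then be a candidate for closer inspection, in line with the suspicion rule stated above.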



5.3.1.2 Recognition of Traffic Volume 

Traffic volume recognition is used to identify malicious sources that transmit 
large volumes of data which significantly differ from their normal volume. 
Specifically, we classify Internet data sources into small sources and large sources. 
The former relates to individual IP addresses whose traffic volume is normally 
tiny. The latter relates to Proxy traffic or Spider (Crawler) traffic¹ whose volume 
is drastically higher. 

ZZZ keeps individual volume parameters for each of the large sources. 
Individual parameters are not kept for the small sources; rather a single fixed 
small number (related to their mean volume averaged over all these sources) will 
be recorded. At attack time the traffic volumes of individual flows will be 
measured and compared to their recorded volume. Flows whose volume 
drastically differs (upwards) from their recorded measure are marked as being 
malicious. 



¹ The traffic volume resulting from a Spider access is normally higher than that of a single human user, 
especially on large WEB sites. The reason is that a spider scans the whole site, leading to hundreds or 
thousands of requests while a human client requests tens of pages or less, on average. 

The mathematical formulation of this procedure is as follows: Given are K 
classes of flows, indexed 1, 2, ..., K, and characterized by the mean ($\mu_i$) and the 
variance ($\sigma_i$) of their learned volume, and by their current volume ($X_i$). We 
would like to identify the classes with the largest deviation from their 
corresponding expected volume. Let $Y_i = (X_i - \mu_i)/\sigma_i$. We will sort the classes 
by the value of $Y_i$ and recommend blocking the classes with the largest values of 
$Y_i$. 
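
The ranking step can be sketched in a few lines of Python. The function and argument names are hypothetical, and the sketch assumes the divisor is a strictly positive spread parameter supplied per class, taken here as given by the text.

```python
def rank_classes(learned, current):
    """Rank flow classes by normalized upward deviation from learned volume.

    learned: dict mapping class -> (mu, sigma), with sigma > 0 (assumed).
    current: dict mapping class -> current volume X; missing classes count as 0.
    Returns a list of (class, Y) pairs, largest Y first.
    """
    scores = {}
    for cls, (mu, sigma) in learned.items():
        x = current.get(cls, 0.0)
        scores[cls] = (x - mu) / sigma
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

The classes at the head of the returned list are the ones the procedure would recommend blocking first.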

5.3.2 Time accumulating traffic volume recognition and "controlled" denial of service 

It is important to recognize that the effectiveness of volume recognition 
increases with the time duration along which it is implemented. This is correct 
since the variance of the time-averaged data volume generated by a source during a period of 
duration T decreases in T. For example, it is expected that the average amount of 
traffic generated by a small source during a period of 1 hour will be very small. 
However, at certain epochs, it is expected that the average amount of traffic 
generated by the same source during a period of 1 minute can be rather large (up 
to 60 times larger than that of the 1 hour average). 

For this reason ZZZ will implement the following unique recognition and 
traffic screening mechanism. For source i, let $S_i(t)$ denote the amount of traffic 
generated by the source during the interval (0,t) (where we assume that the attack 
starts at time 0). We then set at time t: $X_i(t) = S_i(t)/t$ and apply the above 
screening mechanism. 
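
A minimal Python sketch of the accumulated measure follows. The class name and units are hypothetical; the only thing taken from the text is the definition of the accumulated volume divided by the elapsed time since the start of the attack.

```python
class AccumulatedVolume:
    """Tracks S_i(t), the traffic accumulated per source since time 0."""

    def __init__(self):
        self._total = {}

    def record(self, source, amount):
        """Add amount of traffic observed from source."""
        self._total[source] = self._total.get(source, 0.0) + amount

    def rate(self, source, t):
        """Return X_i(t) = S_i(t) / t for elapsed time t > 0."""
        if t <= 0:
            raise ValueError("t must be positive")
        return self._total.get(source, 0.0) / t
```

Under this measure, a source that sends one short burst sees its value shrink as t grows, whereas a source that keeps sending at a steady rate keeps a steady value, which is what lets the screening separate the two over time.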

This mechanism has the following properties: 

1. For a small value of t (that is, in the first 
few minutes of the attack) a sophisticated 
attacker might cause denial of service to a 
significant number of innocent users. This is due to 
the fact that the attacker may inflict a load that 
resembles that of an innocent client, and thus the 
attacker is not distinguishable from the innocent 
client. At this stage, ZZZ may block some 
innocent clients and some attackers. Using this 
action, for a short period of time, some innocent 
clients may be denied service but ZZZ 
protects the site from going down. 

2. As t increases more and more malicious 
sources are identified and blocked and fewer 
innocent sources are blocked. This is because the 
malicious sources have posed a large amount of 
accumulated load. Thus as time progresses fewer 
and fewer innocent clients are denied service. In 
fact, after a relatively short period all malicious 
sources will be denied service while the 
innocent sources will receive full regular 
service. 

Example: Consider the traffic volume generated on the web site of the 
Nagano server (Feb 98). It had 11,665,713 requests made over a period of 24 
hours by 59,582 clients. Assuming uniform distribution of clients over the day, 
this implies about 2500 clients per hour and 500 clients per 12-minute interval. An 
attacker who uses 500 sophisticated daemons (which imitate a normal client) will 
look innocent during the first 12-minute interval. During this period ZZZ will block 50% 
of the innocent clients and 50% of the daemons. However, after 24 minutes the 
daemons will have generated significantly more traffic than an innocent client and thus 
almost all of the traffic blocked will be that of malicious daemons. 

5. Attack Identification, Recognition and Isolation via the Statistical 
Recognition Unit. 

All the victim traffic that has passed the anti-spoofing authentication and was not 
stopped by the filter flows through the statistical unit. The statistical unit analyzes the 
traffic and identifies malicious sources (i.e., compromised sources), and provides 
operational rules for blocking the attack without disturbing innocent genuine traffic. 
The basic principle behind the unit's operation is that the pattern of traffic originating 
from a black-hat daemon drastically differs from the pattern generated by such a 
source during normal operation. In contrast, traffic patterns of "innocent" sources 
during an attack resemble their traffic at normal times. This principle is used to 
identify the attack sources and provide guidelines for their blockage by either the 
filter or the access lists of the routers. For example, the volume of traffic from an 
attacking daemon, the distribution of packet sizes, port numbers, the distribution of 
the packets' inter-arrival times, and the ratio of inbound and outbound traffic are all 
parameters that may indicate that a source (client) is an attacking daemon. 

The statistical unit has two major tasks: 

1. Learning the traffic patterns during normal operation, i.e., when no attack is 
being mounted. These patterns are used while defending a victim during an attack 
to compare with the actual traffic in order to distinguish the malicious traffic from 
genuine traffic. We consider three possible ways in which this learning can be 
done: (1) Using the routers' NetFlow data, (2) Analyzing the server logs at the 
victim server, and (3) Analyzing the potential victim traffic at the guard by 
having the traffic diverted to the guard from time to time for randomly sampling 
it. 

2. Monitoring the victim traffic during an attack to identify and isolate the 
malicious traffic from the good genuine traffic. The identity of the attacking host 
is then given to the filter or the neighboring routers that would then drop any 
packet arriving from that host. 



5.1 Network flows and traffic classification 

The basic element studied by the statistical unit is a flow. Each flow is a 
sequence of packets belonging to the same connection. In the most general way a 
flow is identified by the following parameters: Source IP address, Source port, 
Destination port, Protocol type, time of day and day of week of connection creation. 
The destination IP address is implied since we collect all the information per 
destination address. For each such flow the traffic volume is registered. 
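
The flow identification described above could be represented as follows. This is only a sketch: the class and field names are hypothetical, and the per-destination table mirrors the statement that the destination IP address is implied by how the information is collected.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowKey:
    """Parameters identifying a flow; the destination IP is implied by the table."""
    src_ip: str
    src_port: int
    dst_port: int
    protocol: str
    hour_of_day: int   # 0-23, at connection creation
    day_of_week: int   # 0 = Monday, at connection creation


class FlowTable:
    """Per-destination registry of traffic volume for each flow."""

    def __init__(self):
        self._volumes = defaultdict(lambda: defaultdict(int))

    def register(self, dst_ip, key, volume):
        self._volumes[dst_ip][key] += volume

    def volume(self, dst_ip, key):
        return self._volumes.get(dst_ip, {}).get(key, 0)
```

Because the key is a frozen dataclass, it is hashable and can be used directly as a dictionary key.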

Keeping all of the above information is infeasible since it requires an 
unacceptable amount of memory. However, we employ learning methods to study the 
basic characteristics of the traffic destined to each destination and keep these key 
parameters succinctly, in an efficient way. Essentially the learning method studies the 
typical behavior of groups of users that interact with the destination. For example, a 
typical web site is accessed either by individual users sitting behind a host (pc), by a 
group of users sharing one multi-user time-sharing host, or by a group of users sitting 
behind a proxy. For each such group its typical behavior is studied. Other types of 



1 



EXHIBIT D 



users are possible such as web crawlers (for search engines) and monitoring servers 
such as keynote (www.keynote.com). Furthermore, for the largest group, i.e. the group 
of individual users, their identities (IP addresses) are not kept. Each source which is not 
included in the other groups is assumed to be an individual source. On the other hand, 
for the group of proxy machines that access the destination, the individual IP address 
of each is kept in a trie-like data structure. For other groups of users only their IP 
address may be needed since their traffic would be blocked from the beginning during 
an attack. Henceforth, the rest of this section considers types of users and the 
characteristic of flows originating from such users. 
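
The trie-like structure mentioned above for proxy addresses is not specified further in the text. One possible reading, shown here purely as an assumption, is a binary trie keyed on the 32 bits of an IPv4 address, with the learned parameters of each proxy stored at the leaf.

```python
import ipaddress


class IPTrie:
    """Binary trie over IPv4 address bits (an illustrative assumption)."""

    def __init__(self):
        self._root = {}

    @staticmethod
    def _bits(ip):
        n = int(ipaddress.IPv4Address(ip))
        # Most significant bit first, so shared prefixes share a path.
        return [(n >> shift) & 1 for shift in range(31, -1, -1)]

    def insert(self, ip, value):
        node = self._root
        for bit in self._bits(ip):
            node = node.setdefault(bit, {})
        node["value"] = value

    def lookup(self, ip):
        node = self._root
        for bit in self._bits(ip):
            node = node.get(bit)
            if node is None:
                return None
        return node.get("value")
```

A source that is not found in the trie (and not in any other group) would then fall back to the shared parameters of the individual-user group.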

The basic parameters characterizing each user group are: 

1. Traffic volume distribution: These include the mean and variance of the 
traffic such a user generates. 

2. Port numbers distribution: Source port number distribution, and 
destination port number distribution. 

3. Periodicity: Sources will be examined for the periodicity of their requests. 
It is likely that malicious sources act in a relatively periodic manner, while 
innocent sources do not act in a regular manner. 

4. Packet Properties: The distribution of packet sizes. 

5.2 Learning Traffic Characteristics 

There are three possible ways that we consider to learn and analyze the traffic 
characteristics of a particular target: 

1. Sampling a fraction α of the packets (0 < α ≤ 1) traversing the lines on 
route to the target and then classifying the packets according to the flow id and 
time of day and day of week. 

Notice that setting α = 1 requires the unit to process every packet and thus imposes 
a high load on it while providing the best statistical measure. Lower values of α 
reduce the load posed on the unit while potentially somewhat degrading the 
statistical measure. The fraction α, therefore, will be a parameter that will be set 
so that enough statistical knowledge can be gained without overloading the 
system. 

2. Utilizing server logs collected by the defended target. These typically 
contain information about the activity being applied on the target. For example, 
WEB sites, which are likely to form the main body of potential targets, keep logs 
that record all the document requests sent to the site (including their source 
address, time of the day and other parameters). Processing of these logs by ZZZ 
yields a very accurate measure of the statistics of network flow volumes 
(measured in packets per second, as in a) above). The potential drawbacks of this 
method are, first, that being collected at the target it is not immediately clear which 
information is relevant to which guarding point, and second, that the pattern seen by 
the target may be slightly different from the pattern seen at the network border. 
However, neither is a real problem and the first one may be a feature, since 
network routes change and the traffic may enter the network from a different point 
in any event. 

3. Analyzing netflow data collected from the appropriate routers. This 
option requires the backbone provider to enable netflow and process it with our 
learning applications. This method has some limitations but none seems 
prohibitive. The limitations are that netflow aggregates information for each flow 
in intervals of a few minutes (typically 5-minute intervals), and in these intervals it 
does not maintain the sizes of individual packets. Rather, it counts the total 
number of packets and bytes passed in the interval for each flow. 
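
To make the limitation concrete, the sketch below shows what can still be derived from one aggregated record: average rates over the interval and a mean packet size, but not the distribution of individual packet sizes. The record layout (a dictionary with `packets` and `bytes` fields) is a hypothetical stand-in and does not reflect any actual netflow export format.

```python
def summarize_record(record, interval_seconds=300):
    """Derive per-interval averages from an aggregated flow record.

    record: dict with total 'packets' and 'bytes' for one flow in one
    interval (hypothetical layout). Individual packet sizes are not
    recoverable; only their mean is.
    """
    packets = record["packets"]
    nbytes = record["bytes"]
    return {
        "packets_per_second": packets / interval_seconds,
        "bytes_per_second": nbytes / interval_seconds,
        "mean_packet_size": nbytes / packets if packets else 0.0,
    }
```

The default of 300 seconds corresponds to the typical 5-minute aggregation interval mentioned above.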



5.3 Traffic Monitoring and Analysis at Attack Time 

At attack time, while the guard machine defends a victim, it monitors the victim traffic, 
classifies its traffic (incoming and outgoing) and compares the traffic to the normal 
traffic in order to detect the malicious traffic. Notice that during an attack 
information is collected only on the current flows. The information about well 
behaving flows is not kept for more than a small number of minutes. 

1. Online traffic volume collection at attack time: This module collects 
the statistics of the traffic destined to the target(s) at attack time. 
Notice that in this sense, its measures are similar to the measures 
collected in approach 1 above. The classification of the traffic, in 
general, is similar to that conducted in the learning phase but may be 
controlled/guided by external intervention. Such intervention is 
enacted if some additional knowledge on the attack type is gained from 
other sources (e.g., human-aided identification) and can be utilized by 
the unit. 

2. Attack Analysis: This is conducted at attack time and is responsible for 
comparing the statistical data learned with the current traffic volume and 
generating rules for traffic blockage. The output of this unit, in general, 
will consist of a list of items for each of which three parameters will be 
provided: 

a. Network flow, identified by a combination of source IP address (can 
be prefixed), destination IP address, destination port number, protocol 
type (one may consider blockage that disregards port numbers, i.e., all 
the traffic originating from a compromised IP address, be it a proxy or 
a host). 

b. Duration, identifying the duration for which that class will be 
blocked. 

The analysis is based on the statistical parameters of the data and aims 
at keeping the target traffic at normal loads by blocking the most 
"suspicious" traffic streams. Blocking rules are based on maximizing 
the likelihood of blocking malicious traffic while minimizing the 
likelihood of blocking innocent traffic. 
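
A blocking item of the kind described above (a flow specification plus a duration) might be represented as in the following sketch. All names are hypothetical, times are plain seconds, and a destination port of `None` stands for the variant that disregards port numbers.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BlockRule:
    src_prefix: str            # e.g. "10.1.2.0/24"; a /32 blocks one host
    dst_ip: str
    dst_port: Optional[int]    # None means: block regardless of port
    protocol: str
    start: float               # time the rule was issued (seconds)
    duration: float            # how long the flow class stays blocked

    def matches(self, src_ip, dst_ip, dst_port, protocol, now):
        """Return True if the packet falls under this rule at time now."""
        if not (self.start <= now < self.start + self.duration):
            return False  # rule not yet active or already expired
        if ipaddress.ip_address(src_ip) not in ipaddress.ip_network(self.src_prefix):
            return False
        if dst_ip != self.dst_ip or protocol != self.protocol:
            return False
        return self.dst_port is None or self.dst_port == dst_port
```

Letting rules expire on their own keeps a mistaken block of an innocent source temporary, which fits the stated aim of minimizing harm to innocent traffic.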

5.3.1 Statistical Recognition of Data "Innocence" 

NETGUARDS uses two major properties of network flows to identify 
malicious traffic: a) Traffic pattern, and b) Traffic volume. Below we describe the 
recognition approaches based on these factors. 

5.3.1.1 Recognition of Traffic Pattern 

Several aspects of traffic pattern are examined: 

1) Source "IP geography" proximity: Sources will be classified 
into classes that resemble the "IP geography", that is IP addresses that 
reside on neighboring networks (using IP address prefix) are classified 
in the same class. A class that generates a relatively large volume of 
requests is suspected as being malicious. Notice, that such "malicious 
classes" are likely to form if the attacker planted a collection of 
daemons in the same network, and this network does not use a proxy. 

2) Periodicity: Sources will be examined for the periodicity of 
their requests. It is likely that malicious sources act in a relatively 
periodic manner, while innocent sources act in a more irregular pattern. 

3) Packet Properties: Sources will be examined for repetitive 
properties of their packets. For example the distribution of packet size. 
It is likely that malicious sources generate packets of identical 
properties (e.g., all packets of the same size) while innocent sources will 
generate packets of a more random nature. Other properties include port 
number distributions. 
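
The text does not say how periodicity is to be measured. One simple possibility, offered here only as an assumption, is the coefficient of variation of the gaps between a source's requests: a value near zero indicates nearly regular spacing, while larger values indicate irregular behavior.

```python
from statistics import mean, pstdev


def interarrival_cv(timestamps):
    """Coefficient of variation of inter-arrival gaps for one source.

    timestamps: request times in ascending order (seconds).
    Returns None when there are too few requests to judge.
    """
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return None
    avg = mean(gaps)
    if avg == 0:
        return None
    return pstdev(gaps) / avg
```

Any threshold separating "periodic" from "irregular" sources would have to be learned from normal traffic; none is suggested here.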



5.3.1.2 Recognition of Traffic Volume 

Traffic volume recognition is used to identify malicious sources that transmit 
large volumes of data which significantly differ from their normal volume. 
Specifically, we classify Internet data sources into small sources and large sources. 
The former relates to individual IP addresses whose traffic volume is normally 
tiny. The latter relates to Proxy traffic or Spider (Crawler) traffic¹ whose volume 
is drastically higher. 

ZZZ keeps individual volume parameters for each of the large sources. 
Individual parameters are not kept for the small sources; rather a single fixed 
small number (related to their mean volume averaged over all these sources) will 
be recorded. At attack time the traffic volumes of individual flows will be 
measured and compared to their recorded volume. Flows whose volume 
drastically differs (upwards) from their recorded measure are marked as being 
malicious. 

The mathematical formulation of this procedure is as follows: Given are K 
classes of flows, indexed 1, 2, ..., K, and characterized by the mean ($\mu_i$) and the 
variance ($\sigma_i$) of their learned volume, and by their current volume ($X_i$). We 
would like to identify the classes with the largest deviation from their 
corresponding expected volume. Let $Y_i = (X_i - \mu_i)/\sigma_i$. We will sort the classes 
by the value of $Y_i$ and recommend blocking the classes with the largest values of 
$Y_i$. 

¹ The traffic volume resulting from a Spider access is normally higher than that of a single human user, 
especially on large WEB sites. The reason is that a spider scans the whole site, leading to hundreds or 
thousands of requests while a human client requests tens of pages or less, on average. 

5.3.2 Time accumulating traffic volume recognition and "controlled" denial of service 

It is important to recognize that the effectiveness of volume recognition 
increases with the time duration along which it is implemented. This is correct 
since the variance of the time-averaged data volume generated by a source during a period of 
duration T decreases in T. For example, it is expected that the average amount of 
traffic generated by a small source during a period of 1 hour will be very small. 
However, at certain epochs, it is expected that the average amount of traffic 
generated by the same source during a period of 1 minute can be rather large (up 
to 60 times larger than that of the 1 hour average). 

For this reason ZZZ will implement the following unique recognition and 
traffic screening mechanism. For source i, let $S_i(t)$ denote the amount of traffic 
generated by the source during the interval (0,t) (where we assume that the attack 
starts at time 0). We then set at time t: $X_i(t) = S_i(t)/t$ and apply the above 
screening mechanism. 

This mechanism has the following properties: 

1. For a small value of t (that is, in the first 
few minutes of the attack) a sophisticated 
attacker might cause denial of service to a 
significant number of innocent users. This is due to 
the fact that the attacker may inflict a load that 
resembles that of an innocent client, and thus the 
attacker is not distinguishable from the innocent 
client. At this stage, ZZZ may block some 
innocent clients and some attackers. Using this 
action, for a short period of time, some innocent 
clients may be denied service but ZZZ 
protects the site from going down. 

2. As t increases more and more malicious 
sources are identified and blocked and fewer 
innocent sources are blocked. This is because the 
malicious sources have posed a large amount of 
accumulated load. Thus as time progresses fewer 
and fewer innocent clients are denied service. In 
fact, after a relatively short period all malicious 
sources will be denied service while the 
innocent sources will receive full regular 
service. 

Example: Consider the traffic volume generated on the web site of the 
Nagano server (Feb 98). It had 11,665,713 requests made over a period of 24 
hours by 59,582 clients. Assuming uniform distribution of clients over the day, 
this implies about 2500 clients per hour and 500 clients per 12-minute interval. An 
attacker who uses 500 sophisticated daemons (which imitate a normal client) will 
look innocent during the first 12-minute interval. During this period ZZZ will block 50% 
of the innocent clients and 50% of the daemons. However, after 24 minutes the 
daemons will have generated significantly more traffic than an innocent client and thus 
almost all of the traffic blocked will be that of malicious daemons. 

5. Attack Identification, Recognition and Isolation via the Statistical 
Recognition Unit. 

All the victim traffic that has passed the anti-spoofing authentication and was not 
stopped by the filter flows through the statistical unit. The statistical unit analyzes the 
traffic and identifies malicious sources (i.e., compromised sources), and provides 
operational rules for blocking the attack without disturbing innocent genuine traffic. 
The basic principle behind the unit's operation is that the pattern of traffic originating 
from a black-hat daemon drastically differs from the pattern generated by such a 
source during normal operation. In contrast, traffic patterns of "innocent" sources 
during an attack resemble their traffic at normal times. This principle is used to 
identify the attack sources and provide guidelines for their blockage by either the 
filter or the access lists of the routers. For example, the volume of traffic from an 
attacking daemon, the distribution of packet sizes, port numbers, the distribution of 
the packets' inter-arrival times, and the ratio of inbound and outbound traffic are all 
parameters that may indicate that a source (client) is an attacking daemon. 

The statistical unit has two major tasks: 

1. Learning the traffic patterns during normal operation, i.e., when no 
attack is being mounted. These patterns are used while defending a victim during 
an attack to compare with the actual traffic in order to distinguish the malicious 
traffic from genuine traffic. We consider three possible ways in which this 
learning can be done: (1) Using the routers' NetFlow data, (2) Analyzing the 
server logs at the victim server, and (3) Analyzing the potential victim traffic at 
the guard by having the traffic diverted to the guard from time to time for 
randomly sampling it. 

2. Monitoring the victim traffic during an attack to identify and isolate 
the malicious traffic from the good genuine traffic. The identity of the attacking 
host is then given to the filter or the neighboring routers that would then drop any 
packet arriving from that host. 



5.1 Network flows and traffic classification 

The basic element studied by the statistical unit is a flow. Each flow is a 
sequence of packets belonging to the same connection. In the most general way a 
flow is identified by the following parameters: Source IP address, Source port, 
Destination port, Protocol type, time of day and day of week of connection creation. 
The destination IP address is implied since we collect all the information per 
destination address. For each such flow the traffic volume is registered. 

Keeping all of the above information is infeasible since it requires an 
unacceptable amount of memory. However, we employ learning methods to study the 
basic characteristics of the traffic destined to each destination and keep these key 
parameters succinctly, in an efficient way. Essentially the learning method studies the 
typical behavior of groups of users that interact with the destination. For example, a 
typical web site is accessed either by individual users sitting behind a host (pc), by a 
group of users sharing one multi-user time-sharing host, or by a group of users sitting 
behind a proxy. For each such group its typical behavior is studied. Other types of 
users are possible such as web crawlers (for search engines) and monitoring servers 
such as keynote (www.keynote.com). For the individual users, their identities (IP 
addresses) are not kept, to save storage. Each source which is not included in the other 
groups is assumed to be an individual source. On the other hand, for the group of 
proxy machines that access the destination, the individual IP address of each is kept in 
a trie-like data structure. For other groups of users only their IP address may be 
needed since their traffic would be blocked from the beginning during an attack. 
Henceforth, the rest of this section considers types of users and the characteristic of 
flows originating from such users. 

The basic parameters characterizing each user group are: 

1. Traffic volume distribution: These include the mean and variance of the 
traffic such a user generates. 

2. Port numbers distribution: Source port number distribution, and destination 
port number distribution. 

3. Periodicity: Sources will be examined for the periodicity of their requests. It is 
likely that malicious sources act in a relatively periodic manner, while innocent 
sources do not act in a regular manner. 

4. Packet Properties: The distribution of packet sizes. 

5.2 Learning Traffic Characteristics 

There are three possible ways that we consider to learn and analyze the traffic 
characteristics of a particular target: 

1. Sampling a fraction α of the packets (0 < α ≤ 1) traversing the lines on route 
to the target and then classifying the packets according to the flow id and time of 
day and day of week. 

Notice that setting α = 1 requires the unit to process every packet and thus imposes 
a high load on it while providing the best statistical measure. Lower values of α 
reduce the load posed on the unit while potentially somewhat degrading the 
statistical measure. The fraction α, therefore, will be a parameter that will be set 
so that enough statistical knowledge can be gained without overloading the 
system. 

2. Utilizing server logs collected by the defended target. These typically contain 
information about the activity being applied on the target. For example, WEB 
sites, which are likely to form the main body of potential targets, keep logs that 
record all the document requests sent to the site (including their source address, 
time of the day and other parameters). Processing of these logs by ZZZ yields a 
very accurate measure of the statistics of network flow volumes (measured in 
packets per second, as in a) above). The potential drawbacks of this method are, 
first, that being collected at the target it is not immediately clear which information 
is relevant to which guarding point, and second, that the pattern seen by the target may 
be slightly different from the pattern seen at the network border. However, neither 
is a real problem and the first one may be a feature, since network routes change 
and the traffic may enter the network from a different point in any event. 

3. Analyzing netflow data collected from the appropriate routers. This option 
requires the backbone provider to enable netflow and process it with our learning 
applications. This method has some limitations but none seems prohibitive. The 
limitations are that netflow aggregates information for each flow in intervals of a 
few minutes (typically 5-minute intervals), and in these intervals it does not 
maintain the sizes of individual packets. Rather, it counts the total number of 
packets and bytes passed in the interval for each flow. 



5.3 Traffic Monitoring and Analysis at Attack Time 

At attack time, while the guard machine defends a victim, it monitors the victim traffic, 
classifies its traffic (incoming and outgoing) and compares the traffic to the normal 
traffic in order to detect the malicious traffic. Notice that during an attack 
information is collected only on the current flows. The information about well 
behaving flows is not kept for more than a small number of minutes. 

1. Online traffic volume collection at attack time: This module collects the 
statistics of the traffic destined to the target(s) at attack time. Notice that in 
this sense, its measures are similar to the measures collected in approach 1 
above. The classification of the traffic, in general, is similar to that conducted 
in the learning phase but may be controlled/guided by external intervention. 
Such intervention is enacted if some additional knowledge on the attack type 
is gained from other sources (e.g., human-aided identification) and can be 
utilized by the unit. 

2. Attack Analysis: This unit operates at attack time and is responsible for comparing 
the learned statistical data with the current traffic volume and generating rules 
for traffic blockage. The output of this unit, in general, will consist of a list of 
items, for each of which three parameters will be provided: 

a. Network flow, identified by a combination of source IP address (which may be 
a prefix), destination IP address, destination port number, and protocol type (one 
may consider blockage that disregards port numbers, i.e., all the traffic 
originating from a compromised IP address, be it a proxy or a host). 

b. Duration, identifying the duration for which that class will be blocked. 

The analysis is based on the statistical parameters of the data and aims at 
keeping the target traffic at normal loads by blocking the most "suspicious" 
and "harmful" traffic streams. Blocking rules are based on maximizing the 
likelihood of blocking malicious traffic while minimizing the likelihood of 
blocking innocent traffic. 
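
One possible in-memory shape for such an output item is sketched below. The class name, field names and matching logic are hypothetical and only cover the two parameters listed above (flow and duration); the addresses are documentation-only example values.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class BlockRule:
    """Hypothetical blocking rule: a flow description plus a duration."""
    source_prefix: str               # e.g. "192.0.2.0/24"
    destination_ip: str
    destination_port: Optional[int]  # None means "disregard port numbers"
    protocol: Optional[str]          # None means "any protocol"
    duration_seconds: int            # how long the class stays blocked

    def matches(self, src: str, dst: str, port: int, proto: str) -> bool:
        """Return True if a packet with these header fields falls under the rule."""
        if ip_address(src) not in ip_network(self.source_prefix):
            return False
        if dst != self.destination_ip:
            return False
        if self.destination_port is not None and port != self.destination_port:
            return False
        return self.protocol is None or proto == self.protocol


rule = BlockRule("192.0.2.0/24", "198.51.100.10", None, "tcp", 600)
```

Leaving `destination_port` unset corresponds to the port-agnostic blockage of a compromised address mentioned in item a.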

5.3.1 Statistical Recognition of Data "Innocence" 

NETGUARDS uses two major properties of network flows to identify 
malicious traffic: a) Traffic pattern, and b) Traffic volume. Below we describe the 
recognition approaches based on these factors. 

5.3.1.1 Recognition of Traffic Pattern 
Several aspects of traffic pattern are examined: 
1) Source "IP geography" proximity: Sources will be classified into 
classes that resemble the "IP geography", that is, IP addresses that reside on 
neighboring networks (using the IP address prefix) are classified in the same class. 
A class that generates a relatively large volume of requests is suspected of 
being malicious. Notice that such "malicious classes" are likely to form if the 
attacker planted a collection of daemons in the same network, and this 
network does not use a proxy. 

2) Periodicity: Sources will be examined for the periodicity of their 
requests. It is likely that malicious sources act in a relatively periodic manner, 
while innocent sources act in a more irregular pattern. 

3) Packet Properties: Sources will be examined for repetitive properties 
of their packets, for example the distribution of packet sizes. It is likely that 
malicious sources generate packets of identical properties (e.g., all packets of 
the same size) while innocent sources will generate packets of a more random 
nature. Other properties include port number distributions. 
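
The first two aspects can be sketched as follows. This is a minimal illustration, not the system's classifier: the /24 prefix length, the function names and the use of the coefficient of variation as a periodicity measure are all assumptions introduced here.

```python
from collections import Counter
from ipaddress import ip_network
from statistics import mean, pstdev


def prefix_class(source_ip: str, prefix_len: int = 24) -> str:
    """Map an IPv4 source to its enclosing prefix (its "IP geography" class)."""
    return str(ip_network(f"{source_ip}/{prefix_len}", strict=False))


def requests_per_class(sources, prefix_len: int = 24) -> Counter:
    """Count requests per prefix class, given one source IP per request."""
    return Counter(prefix_class(ip, prefix_len) for ip in sources)


def interarrival_cv(timestamps):
    """Coefficient of variation of the gaps between consecutive requests.

    Values near 0 indicate strongly periodic requests; larger values
    indicate irregular timing. Returns None when there is too little data.
    """
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return None
    return pstdev(gaps) / mean(gaps)


counts = requests_per_class(["192.0.2.1", "192.0.2.9", "198.51.100.3"])
periodic = interarrival_cv([0, 10, 20, 30])
irregular = interarrival_cv([0, 1, 20, 22, 60])
```

The same coefficient-of-variation idea could be applied to packet sizes for aspect 3, where a value near 0 would indicate packets of identical size.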



5.3.1.2 Recognition of Traffic Volume 

Traffic volume recognition is used to identify malicious sources that transmit 
large volumes of data which significantly differ from their normal volume. 
Specifically, we classify Internet data sources into small sources and large sources. 
The former relates to individual IP addresses whose traffic volume is normally 
tiny. The latter relates to Proxy traffic or Spider (Crawler) traffic*, whose volume 
is drastically higher. 

ZZZ keeps individual volume parameters for each of the large sources. 
Individual parameters are not kept for the small sources; rather, a single fixed 
small number (related to their mean volume averaged over all these sources) will 
be recorded. At attack time the traffic volumes of individual flows will be 
measured and compared to their recorded volume. Flows whose volume 
drastically differs (upwards) from their recorded measure are marked as being 
malicious. 

The mathematical formulation of this procedure is as follows: Given are K 
classes of flows, indexed 1, 2, ..., K, and characterized by the mean ($\mu_i$) and the 
variance ($\sigma_i^2$) of their learned volume, and by their current volume ($X_i$). We 
would like to identify the classes with the largest deviation from their 
corresponding expected volume. Let $Y_i = (X_i - \mu_i)/\sigma_i$. We will sort the classes 
by the value of $Y_i$ and recommend blocking the classes with the largest values of 
$Y_i$. Blockage of sources will be done sequentially until the total volume of 
sources fits the traffic volume to be sustained by the network/defended site. 
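
The ranking and sequential blockage just described can be sketched as below. The function name, the dictionary layout and the sample numbers are assumptions for illustration; the sketch also assumes every class has a strictly positive learned variance.

```python
from math import sqrt


def select_classes_to_block(classes, capacity):
    """Pick classes to block, most deviant first, until the load fits.

    classes:  dict mapping a class name to (mean, variance, current_volume).
    capacity: traffic volume the defended site can sustain.
    Assumes variance > 0 for every class.
    """
    def deviation(item):
        _, (mu, var, x) = item
        return (x - mu) / sqrt(var)  # normalized deviation of the class

    ranked = sorted(classes.items(), key=deviation, reverse=True)
    total = sum(x for _, _, x in classes.values())
    blocked = []
    for name, (_, _, x) in ranked:
        if total <= capacity:
            break  # remaining traffic already fits
        blocked.append(name)
        total -= x
    return blocked


# Made-up numbers: (learned mean, learned variance, current volume)
sample = {
    "proxy_a": (1000, 10000, 1100),
    "small_src": (10, 25, 400),
    "crawler": (500, 2500, 2000),
}
to_block = select_classes_to_block(sample, capacity=1500)
```

With these sample values the small source deviates most relative to its learned behavior, so it is blocked before the crawler even though the crawler carries more traffic in absolute terms.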



* The traffic volume resulting from a Spider access is normally higher than that of a single human user, 
especially on large Web sites. The reason is that a spider scans the whole site, leading to hundreds or 
thousands of requests, while a human client requests tens of pages or less, on average. 
5.3.2 Time-accumulating traffic volume recognition 
and "controlled" denial of service 

It is important to recognize that the effectiveness of volume recognition 
increases with the length of time over which it is applied. This is correct 
since the variance of the average data rate generated by a source over a period of 
duration T (total volume divided by T) decreases in T. For example, it is expected 
that the average amount of traffic generated by a small source during a period of 
1 hour will be very small. However, at certain epochs, the average amount of traffic 
generated by the same source during a period of 1 minute can be rather large (up 
to 60 times larger than the 1-hour average). 
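
One way to make the variance argument concrete is under a simplifying assumption that the text does not state: suppose the volumes $V_1, \dots, V_T$ generated by a source in T consecutive unit intervals are independent, each with mean $m$ and variance $s^2$ (symbols introduced here for illustration only). Then the time-averaged volume satisfies:

```latex
\operatorname{Var}\!\left(\frac{1}{T}\sum_{k=1}^{T} V_k\right)
  = \frac{1}{T^{2}}\sum_{k=1}^{T}\operatorname{Var}(V_k)
  = \frac{s^{2}}{T},
\qquad
\mathbb{E}\!\left[\frac{1}{T}\sum_{k=1}^{T} V_k\right] = m .
```

Under this assumption the averaged measure concentrates around the source's true mean as T grows, so a sustained excess becomes easier to separate from a momentary burst.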

For this reason ZZZ will implement the following unique recognition and 
traffic screening mechanism. For source i, let $S_i(t)$ denote the amount of traffic 
generated by the source during the interval (0,t) (where we assume that the attack 
starts at time 0). We then set at time t: $X_i(t) = S_i(t)/t$ and apply the above 
screening mechanism. 

This mechanism has the following properties: 

1. For a small value of t (that is, in the first few minutes of the attack) a 
sophisticated attacker might cause denial of service to a significant number of 
innocent users. This is due to the fact that the attacker may inflict a load that 
resembles that of an innocent client, and thus the attacker is not 
distinguishable from the innocent client. At this stage, ZZZ may block some 
innocent clients and some attackers. With this action, for a short period of 
time, some innocent clients may be denied service, but ZZZ protects the site 
from going down. 

2. As t increases, more and more malicious sources are identified and 
blocked and fewer innocent sources are blocked. This is because the malicious 
sources have imposed a large amount of accumulated load. Thus, as time 
progresses, fewer and fewer innocent clients are denied service. In fact, after 
a relatively short period all malicious sources will be denied service while the 
innocent sources will receive full regular service. 
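
The effect of accumulating over time can be illustrated with a small sketch. The function name and the per-minute volumes are made up for the example: an innocent client sends one early burst and then goes idle, while a daemon sends a steady load.

```python
from itertools import accumulate


def accumulated_rate_series(per_minute_volume):
    """Return X(t) = S(t) / t for t = 1, 2, ... minutes since attack start.

    S(t) is the running total of the per-minute volumes up to minute t.
    """
    return [s / t for t, s in enumerate(accumulate(per_minute_volume), start=1)]


innocent = [60] + [0] * 11  # one early burst, then idle (made-up numbers)
daemon = [20] * 12          # steady load imitating a client (made-up numbers)

x_innocent = accumulated_rate_series(innocent)
x_daemon = accumulated_rate_series(daemon)
```

In this toy case the innocent client has the higher accumulated rate after the first minute, but after 12 minutes its rate has fallen well below the daemon's, which matches the two properties listed above.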

Example: Consider the traffic volume generated on the web site of the 
Nagano server (Feb 98). It had 11,665,713 requests made over a period of 24 
hours by 59,582 clients. Assuming uniform distribution of clients over the day, 
this implies about 2500 clients per hour and 500 clients per 12-minute interval. An 
attacker who uses 500 sophisticated daemons (which imitate a normal client) will 
look innocent in the first 12-minute interval. In this period ZZZ will block 50% 
of the innocent clients and 50% of the daemons. However, after 24 minutes the 
daemons will generate significantly more traffic than an innocent client and thus 
almost all of the traffic blocked will be that of malicious daemons. 
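
The rounded client counts in this example can be checked with a few lines of arithmetic. The variable names are introduced here; the uniform-distribution assumption is the one stated in the example.

```python
clients = 59_582  # distinct clients over the 24-hour period
hours = 24

# Uniform distribution of clients over the day, as assumed in the example.
clients_per_hour = clients / hours            # roughly 2500
clients_per_12_min = clients_per_hour / 5     # five 12-minute intervals per hour
```

Both values come out slightly below the rounded figures of 2500 and 500 used in the text.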